Posted to common-dev@hadoop.apache.org by "Jakob Homan (JIRA)" <ji...@apache.org> on 2009/03/16 22:59:50 UTC

[jira] Updated: (HADOOP-5467) Create an offline fsimage image viewer

     [ https://issues.apache.org/jira/browse/HADOOP-5467?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jakob Homan updated HADOOP-5467:
--------------------------------

    Attachment: fsimage.xml
                HADOOP-5467.patch

I'm done with a first pass at the offline image viewer.  Unit tests and documentation are still to come, but I'm looking for feedback in the meantime.

The offline image viewer will process fsimage files of layout versions -18 or -19, creating several types of human-readable output.  For instance, with the following (contrived) namespace:
{noformat}
drwxr-xr-x   - jhoman supergroup          0 2009-03-16 21:17 /anotherDir
-rw-r--r--   3 jhoman supergroup  286631664 2009-03-16 21:15 /anotherDir/biggerfile
-rw-r--r--   3 jhoman supergroup       8754 2009-03-16 21:17 /anotherDir/smallFile
drwxr-xr-x   - jhoman supergroup          0 2009-03-16 21:11 /mapredsystem
drwxr-xr-x   - jhoman supergroup          0 2009-03-16 21:11 /mapredsystem/jhoman
drwxr-xr-x   - jhoman supergroup          0 2009-03-16 21:11 /mapredsystem/jhoman/mapredsystem
drwx-wx-wx   - jhoman supergroup          0 2009-03-16 21:11 /mapredsystem/jhoman/mapredsystem/ip.redacted.com
drwxr-xr-x   - jhoman supergroup          0 2009-03-16 21:12 /one
drwxr-xr-x   - jhoman supergroup          0 2009-03-16 21:12 /one/two
drwxr-xr-x   - jhoman supergroup          0 2009-03-16 21:16 /user
drwxr-xr-x   - jhoman supergroup          0 2009-03-16 21:19 /user/jhoman
{noformat}

using the default image processor, which mimics the output of ls, generates this:
{noformat}
[1233]mymac:hadoop-0.21.0-dev jhoman$ bin/hadoop offlineimageviewer -i fsimagedemo 
drwxr-xr-x  -   jhoman supergroup            0 2009-03-16 14:16 /
drwxr-xr-x  -   jhoman supergroup            0 2009-03-16 14:17 /anotherDir
drwxr-xr-x  -   jhoman supergroup            0 2009-03-16 14:11 /mapredsystem
drwxr-xr-x  -   jhoman supergroup            0 2009-03-16 14:12 /one
drwxr-xr-x  -   jhoman supergroup            0 2009-03-16 14:16 /user
-rw-r--r--  3   jhoman supergroup    286631664 2009-03-16 14:15 /anotherDir/biggerfile
-rw-r--r--  3   jhoman supergroup         8754 2009-03-16 14:17 /anotherDir/smallFile
drwxr-xr-x  -   jhoman supergroup            0 2009-03-16 14:11 /mapredsystem/jhoman
drwxr-xr-x  -   jhoman supergroup            0 2009-03-16 14:11 /mapredsystem/jhoman/mapredsystem
drwx-wx-wx  -   jhoman supergroup            0 2009-03-16 14:11 /mapredsystem/jhoman/mapredsystem/ip.redacted.com
drwxr-xr-x  -   jhoman supergroup            0 2009-03-16 14:12 /one/two
drwxr-xr-x  -   jhoman supergroup            0 2009-03-16 14:19 /user/jhoman
{noformat}
The line ordering is different, but this output is very amenable to further processing using standard unix tools and should look familiar to everyone.
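
As an illustration of that kind of further processing, here is a minimal Python sketch that totals file sizes per top-level directory from the ls-style output.  It assumes the eight whitespace-separated columns shown above (permissions, replication, owner, group, size, date, time, path); the function names are hypothetical and not part of the patch.

```python
from collections import defaultdict


def parse_ls_line(line):
    """Parse one ls-style line; return None for anything that does not match."""
    # Assumed columns: perms, replication, owner, group, size, date, time, path.
    # maxsplit=7 keeps a path that contains spaces in one piece.
    parts = line.rstrip("\r\n").split(None, 7)
    if len(parts) != 8:
        return None
    perms, _repl, owner, group, size, _date, _time, path = parts
    if perms[0] not in "d-" or not size.isdigit():
        return None
    return {
        "is_dir": perms[0] == "d",
        "owner": owner,
        "group": group,
        "size": int(size),
        "path": path,
    }


def bytes_per_top_level_dir(lines):
    """Sum file sizes under each top-level directory ("/" for files in the root)."""
    totals = defaultdict(int)
    for line in lines:
        entry = parse_ls_line(line)
        if entry is None or entry["is_dir"]:
            continue
        components = entry["path"].split("/")
        # "/anotherDir/biggerfile" -> "/anotherDir"; "/file" -> "/"
        top = "/" + components[1] if len(components) > 2 else "/"
        totals[top] += entry["size"]
    return dict(totals)
```

Lines that do not look like entries, such as the shell prompt, are skipped rather than treated as errors.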

Another image processor, Console, displays the namespace in a more verbose format that includes individual block entries and any inodes that are under construction in the fsimage:
{noformat}
[1233]mymac:hadoop-0.21.0-dev jhoman$ bin/hadoop offlineimageviewer -i fsimagedemo -p Console
FSImage
  ImageVersion = -19
  NamespaceID = 2109123098
  GenerationStamp = 1003
  INodes [NumInodes = 12]
    Inode
      INodePath = 
      Replication = 0
      ModificationTime = 2009-03-16 14:16
      AccessTime = 1969-12-31 16:00
      BlockSize = 0
      Blocks [NumBlocks = -1]
      NSQuota = 2147483647
      DSQuota = -1
      Permissions
        Username = jhoman
        GroupName = supergroup
        PermString = rwxr-xr-x
    Inode
      INodePath = /anotherDir
      Replication = 0
      ModificationTime = 2009-03-16 14:17
      AccessTime = 1969-12-31 16:00
      BlockSize = 0
      Blocks [NumBlocks = -1]
      NSQuota = -1
      DSQuota = -1
      Permissions
        Username = jhoman
        GroupName = supergroup
        PermString = rwxr-xr-x
    Inode
      INodePath = /mapredsystem
      Replication = 0
      ModificationTime = 2009-03-16 14:11
      AccessTime = 1969-12-31 16:00
      BlockSize = 0
      Blocks [NumBlocks = -1]
      NSQuota = -1
      DSQuota = -1
      Permissions
        Username = jhoman
        GroupName = supergroup
        PermString = rwxr-xr-x
    Inode
      INodePath = /one
      Replication = 0
      ModificationTime = 2009-03-16 14:12
      AccessTime = 1969-12-31 16:00
      BlockSize = 0
      Blocks [NumBlocks = -1]
      NSQuota = -1
      DSQuota = -1
      Permissions
        Username = jhoman
        GroupName = supergroup
        PermString = rwxr-xr-x
    Inode
      INodePath = /user
      Replication = 0
      ModificationTime = 2009-03-16 14:16
      AccessTime = 1969-12-31 16:00
      BlockSize = 0
      Blocks [NumBlocks = -1]
      NSQuota = -1
      DSQuota = -1
      Permissions
        Username = jhoman
        GroupName = supergroup
        PermString = rwxr-xr-x
    Inode
      INodePath = /anotherDir/biggerfile
      Replication = 3
      ModificationTime = 2009-03-16 14:15
      AccessTime = 2009-03-16 14:15
      BlockSize = 134217728
      Blocks [NumBlocks = 3]
        Block
          BlockID = -3825289017228345116
          NumBytes = 134217728
          GenerationStamp = 1002
        Block
          BlockID = -561951562131659349
          NumBytes = 134217728
          GenerationStamp = 1002
        Block
          BlockID = 524543674153268996
          NumBytes = 18196208
          GenerationStamp = 1002
      NSQuota = -1
      DSQuota = -1
      Permissions
        Username = jhoman
        GroupName = supergroup
        PermString = rw-r--r--
    Inode
      INodePath = /anotherDir/smallFile
      Replication = 3
      ModificationTime = 2009-03-16 14:17
      AccessTime = 2009-03-16 14:17
      BlockSize = 134217728
      Blocks [NumBlocks = 1]
        Block
          BlockID = 4922053134320058874
          NumBytes = 8754
          GenerationStamp = 1003
      NSQuota = -1
      DSQuota = -1
      Permissions
        Username = jhoman
        GroupName = supergroup
        PermString = rw-r--r--
    Inode
      INodePath = /mapredsystem/jhoman
      Replication = 0
      ModificationTime = 2009-03-16 14:11
      AccessTime = 1969-12-31 16:00
      BlockSize = 0
      Blocks [NumBlocks = -1]
      NSQuota = -1
      DSQuota = -1
      Permissions
        Username = jhoman
        GroupName = supergroup
        PermString = rwxr-xr-x
    Inode
      INodePath = /mapredsystem/jhoman/mapredsystem
      Replication = 0
      ModificationTime = 2009-03-16 14:11
      AccessTime = 1969-12-31 16:00
      BlockSize = 0
      Blocks [NumBlocks = -1]
      NSQuota = -1
      DSQuota = -1
      Permissions
        Username = jhoman
        GroupName = supergroup
        PermString = rwxr-xr-x
    Inode
      INodePath = /mapredsystem/jhoman/mapredsystem/ip.redacted.com
      Replication = 0
      ModificationTime = 2009-03-16 14:11
      AccessTime = 1969-12-31 16:00
      BlockSize = 0
      Blocks [NumBlocks = -1]
      NSQuota = -1
      DSQuota = -1
      Permissions
        Username = jhoman
        GroupName = supergroup
        PermString = rwx-wx-wx
    Inode
      INodePath = /one/two
      Replication = 0
      ModificationTime = 2009-03-16 14:12
      AccessTime = 1969-12-31 16:00
      BlockSize = 0
      Blocks [NumBlocks = -1]
      NSQuota = -1
      DSQuota = -1
      Permissions
        Username = jhoman
        GroupName = supergroup
        PermString = rwxr-xr-x
    Inode
      INodePath = /user/jhoman
      Replication = 0
      ModificationTime = 2009-03-16 14:19
      AccessTime = 1969-12-31 16:00
      BlockSize = 0
      Blocks [NumBlocks = -1]
      NSQuota = -1
      DSQuota = -1
      Permissions
        Username = jhoman
        GroupName = supergroup
        PermString = rwxr-xr-x
  INodesUnderConstruction [NumINodesUnderConstruction = 0]
{noformat}
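
The Console output is also easy to pick apart line by line.  The following is a rough Python sketch, assuming the "Key = value" layout shown above, that collects the NumBytes values of each inode's blocks.  The function name is hypothetical, and mapping the empty root INodePath to "/" is my own choice, made to match the ls-style output.

```python
def blocks_per_inode(lines):
    """Map each INodePath to the list of NumBytes values of its blocks."""
    sizes = {}
    current = None
    for line in lines:
        # Split on the first "=" only, so a path containing "=" stays intact.
        key, sep, value = line.partition("=")
        if not sep:
            continue  # section headers such as "Inode" or "Block"
        key, value = key.strip(), value.strip()
        if key == "INodePath":
            # The root inode is printed with an empty path; call it "/".
            current = value or "/"
            sizes[current] = []
        elif key == "NumBytes" and current is not None:
            sizes[current].append(int(value))
    return sizes
```

For the sample namespace, the three entries collected for /anotherDir/biggerfile add up to 286631664, the size reported by the ls-style processor.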

The last processor currently implemented is XML, which generates an XML file of the entire structure.  I've attached sample output.  I think this may be the most interesting processor because it allows easy automated processing, but it is also quite verbose: on a cluster here with about 93k files, the resulting XML was 2.7 million lines.  Still, TextMate was able to handle the output with little grumbling!
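
For output of that size, a streaming parser avoids loading the whole document at once.  Below is a minimal Python sketch that counts elements by tag name with the standard library's iterparse.  It deliberately makes no assumption about the element names in fsimage.xml, and the function name is hypothetical.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def count_elements(source):
    """Count elements by tag name; source is a filename or a binary file object."""
    counts = Counter()
    # "end" events fire once an element and all of its children have been read.
    for _event, elem in ET.iterparse(source, events=("end",)):
        counts[elem.tag] += 1
        # Release the element's text, attributes and children after counting it.
        elem.clear()
    return counts
```

Calling count_elements("fsimage.xml") would give a quick census of the file before writing anything schema-specific.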

One option worth noting is -skipBlocks.  In namespaces where many files span several blocks, this option omits the individual block entries and reports only the block count, which significantly decreases the size of the output.
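
To see why the block entries dominate for large files, here is a small Python sketch of how a file length maps to block entries.  It assumes blocks are filled in order and only the last one may be short, which matches the sample Console output above.  It is only an illustration, not code from the patch, and the function name is hypothetical.

```python
def split_into_blocks(file_size, block_size):
    """Return the per-block byte counts for a file of file_size bytes."""
    if block_size <= 0:
        raise ValueError("block_size must be positive")
    full_blocks, remainder = divmod(file_size, block_size)
    # Every block is full except possibly the last one.
    return [block_size] * full_blocks + ([remainder] if remainder else [])
```

With the sample's 134217728-byte block size, the 286631664-byte biggerfile yields two full blocks plus one of 18196208 bytes, so the Console output carries three Block entries for it where -skipBlocks would leave only the count.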

It should be pretty easy to write new image processors and output formats as needed.  I'll work on testing and documentation and upload a patch soon.

> Create an offline fsimage image viewer
> --------------------------------------
>
>                 Key: HADOOP-5467
>                 URL: https://issues.apache.org/jira/browse/HADOOP-5467
>             Project: Hadoop Core
>          Issue Type: New Feature
>          Components: util
>            Reporter: Jakob Homan
>            Assignee: Jakob Homan
>         Attachments: fsimage.xml, HADOOP-5467.patch
>
>
> It would be useful to have a tool to examine/dump the contents of the fsimage file to human-readable form.  This would allow analysis of the namespace (file usage, block sizes, etc) without impacting the operation of the namenode.  XML would be a reasonable output format, as it can be easily viewed, compressed and manipulated via either XSLT or XQuery.
> I've started work on this and will have an initial version soon.
