Posted to dev@bigtop.apache.org by "Martin Bukatovic (JIRA)" <ji...@apache.org> on 2014/06/24 14:30:25 UTC

[jira] [Comment Edited] (BIGTOP-1342) Make TestCLI usable for both HDFS and HCFS

    [ https://issues.apache.org/jira/browse/BIGTOP-1342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14042037#comment-14042037 ] 

Martin Bukatovic edited comment on BIGTOP-1342 at 6/24/14 12:29 PM:
--------------------------------------------------------------------

*The idea behind the initial patch ({{BIGTOP-1342.1.patch}})*:

This patch expects that the patch proposed in BIGTOP-1341 has been applied.

Classes {{TestCLI.java}} and {{FSCmdExecutor.java}} were changed to be hcfs
compliant and moved from the hdfs module into the hcfs module. The full class
name of TestCLI is now {{org.apache.bigtop.itest.hadoop.hcfs.TestCLI}}.

Because it now lives in the hcfs module, TestCLI can be used to test any Hadoop
filesystem, such as HDFS or GlusterFS (no other filesystems were checked,
though). That said, HDFS is still the primary filesystem for TestCLI: you can
run TestCLI on HDFS and it works without any additional configuration. To run
TestCLI on GlusterFS, on the other hand, you need to set several properties I
have introduced (you don't need to do this with HDFS because TestCLI has
hardcoded, HDFS-specific defaults for all of them). TestCLI has its test cases
defined in {{testHCFSConf.xml}}.

Since some test cases are HDFS specific (not applicable to any other Hadoop
fs), I moved those test cases into a separate file, {{testHDFSConf.xml}}, and
added TestHDFSCLI (a subclass of TestCLI) to the hdfs module. There are just a
few test cases in this category, but I expect that all the dfsadmin ones from
BIGTOP-1334 would end up here.

The idea is that with HDFS, you would run both TestCLI and TestHDFSCLI, while
with GlusterFS (or any other Hadoop-compatible fs), you would run just TestCLI.
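
The split described above can be sketched roughly as follows. This is only an illustration, not code from the patch: the class names {{GenericFsCliTest}}, {{HdfsOnlyCliTest}} and {{CliTestPlan}} and the {{getTestFile()}} hook are hypothetical stand-ins, and how the real TestCLI and TestHDFSCLI pick their XML files is an assumption here.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for the generic test class in the hcfs module. */
class GenericFsCliTest {
    /** XML file with the test cases this class runs. */
    protected String getTestFile() {
        return "testHCFSConf.xml";
    }
}

/** Hypothetical stand-in for the HDFS-only subclass in the hdfs module. */
class HdfsOnlyCliTest extends GenericFsCliTest {
    @Override
    protected String getTestFile() {
        return "testHDFSConf.xml";
    }
}

/** Hypothetical helper that decides which test-case files apply. */
final class CliTestPlan {
    /** Returns the test-case files to run for a scheme such as "hdfs:". */
    static List<String> testFilesFor(String scheme) {
        List<String> files = new ArrayList<>();
        // The generic cases run on every Hadoop-compatible filesystem.
        files.add(new GenericFsCliTest().getTestFile());
        // The HDFS-only cases are added just for HDFS.
        if ("hdfs:".equals(scheme)) {
            files.add(new HdfsOnlyCliTest().getTestFile());
        }
        return files;
    }
}
```

With this sketch, {{CliTestPlan.testFilesFor("glusterfs:")}} yields only {{testHCFSConf.xml}}, while {{CliTestPlan.testFilesFor("hdfs:")}} yields both files.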

*Changes of TestCLI to make it hcfs ready*

 * HCFS fs.default.name hack (see {{TestCLI.java}} source for more details)
 * added property {{hcfs.namenode.pattern}}: regexp to match namenode
   (sometimes TestCLI expected to see namenode hostname with port number in
   the output, but this is not applicable to other filesystems)
 * added property {{hcfs.root.groupname}}: root group
   (root group is defined in Hadoop config {{dfs.permissions.superusergroup}}
   for HDFS, while it's always just {{root}} for GlusterFS)
 * added property {{hcfs.dirsize.pattern}}: regexp to match expected dir size
   (this is because HDFS reports zero for size of directories while GlusterFS
   doesn't)
 * added property {{hcfs.scheme}}: defines fs scheme (e.g. "hdfs:", "glusterfs:")
 * removed a few HDFS-specific cases from {{testHCFSConf.xml}} (moved to
   {{testHDFSConf.xml}})

Note: for GlusterFS test runs, I use the following values:

{noformat}
  -Dhcfs.root.username=root
  -Dhcfs.root.groupname=root
  -Dhcfs.scheme=glusterfs:
  -Dhcfs.dirsize.pattern='[0-9]+'
  -Dhcfs.namenode.pattern='' 
{noformat}
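
As a rough illustration of how properties like these could be resolved with HDFS-oriented fallbacks, here is a minimal Java sketch. It is not taken from the patch: the {{HcfsSettings}} class is hypothetical, and all fallback values in it are assumptions ("0" for the directory size because HDFS reports zero, "supergroup" as a guess at the usual {{dfs.permissions.superusergroup}} value, and a simple host:port regexp for the namenode). The real defaults live in {{TestCLI.java}}.

```java
import java.util.Properties;
import java.util.regex.Pattern;

/**
 * Hypothetical helper, not part of the patch: resolves the hcfs.* properties,
 * falling back to HDFS-oriented values when a property is not set.
 */
final class HcfsSettings {
    final String scheme;
    final String rootGroupname;
    final Pattern dirsizePattern;
    final Pattern namenodePattern;

    HcfsSettings(Properties props) {
        // All fallback values below are assumptions made for this sketch.
        scheme = props.getProperty("hcfs.scheme", "hdfs:");
        rootGroupname = props.getProperty("hcfs.root.groupname", "supergroup");
        dirsizePattern = Pattern.compile(
                props.getProperty("hcfs.dirsize.pattern", "0"));
        namenodePattern = Pattern.compile(
                props.getProperty("hcfs.namenode.pattern", "[-.a-zA-Z0-9]+:[0-9]+"));
    }

    /** True if a reported directory size matches the configured pattern. */
    boolean isExpectedDirSize(String reported) {
        return dirsizePattern.matcher(reported).matches();
    }
}
```

Passing {{System.getProperties()}} to the constructor would pick up the {{-D}} values shown above. With {{hcfs.dirsize.pattern}} set to {{[0-9]+}}, any numeric directory size is accepted, whereas the assumed HDFS fallback accepts only {{0}}.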




> Make TestCLI usable for both HDFS and HCFS
> ------------------------------------------
>
>                 Key: BIGTOP-1342
>                 URL: https://issues.apache.org/jira/browse/BIGTOP-1342
>             Project: Bigtop
>          Issue Type: Improvement
>          Components: Tests
>    Affects Versions: 0.8.0
>            Reporter: Martin Bukatovic
>         Attachments: BIGTOP-1342.1.patch
>
>
> TestCLI test cases are currently only runnable on HDFS.  Since most
> test cases are applicable to any Hadoop filesystem, it makes sense to
> make them general in the hcfs sense.
> While most test cases are hcfs generic, some cases are only applicable to HDFS,
> so I propose splitting the current code into:
>  * general HCFS superclass (with most cases in {{testHCFSConf.xml}} file)
>  * HDFS specific subclass (with hdfs only cases in {{testHDFSConf.xml}})
> I would like to keep {{testHCFSConf.xml}} as a common base for any hadoop
> filesystem, which would require introduction of several additional variables to
> catch minor differences between GlusterFS and HDFS. This should be good enough
> for other hcfs implementations as well, but I haven't tested it.
> Before proposing a patch for this, it makes sense to have the following resolved:
>  * BIGTOP-1341 TestCLI cleanup
>  * BIGTOP-1334 Add DFS support to TestCLI


