Posted to dev@bigtop.apache.org by "Lei Zou (Issue Comment Edited) (JIRA)" <ji...@apache.org> on 2012/04/10 19:35:13 UTC
[jira] [Issue Comment Edited] (BIGTOP-476) Improvement of BigTop iTest framework

    [ https://issues.apache.org/jira/browse/BIGTOP-476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13250846#comment-13250846 ] 

Lei Zou edited comment on BIGTOP-476 at 4/10/12 5:33 PM:
---------------------------------------------------------

Thanks for taking the time to review this and for the guidance. Putting the code under org.apache.bigtop.itest.executors.cli makes sense to me, since we can then better leverage the functionality that is already there. I will take a closer look at BigTop's architecture and packaging and see how this piece would fit.

I also have another question: in a clustered Hadoop environment, after BigTop has been deployed successfully, how does BigTop check programmatically that all of the desired datanodes participate in an M/R job? For example, say the Hadoop cluster has one namenode and two datanodes, and we would like to see all three servers participate in the Hadoop job that calculates Pi. Verification succeeds if every node has done some work, and fails if any node was not involved.
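
To make the pass/fail rule concrete, here is a minimal Java sketch. It assumes the hosts that ran task attempts have already been collected somehow (for example from the job history); how to obtain that list is exactly the open question, so the host names and the class and method names below are hypothetical, not existing BigTop or Hadoop APIs.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

public class NodeParticipationCheck {

    /**
     * Returns the expected hosts that ran no task attempt.
     * An empty result means every expected host did some work.
     */
    static Set<String> idleHosts(Set<String> expectedHosts, List<String> taskAttemptHosts) {
        Set<String> idle = new TreeSet<>(expectedHosts);
        idle.removeAll(taskAttemptHosts);
        return idle;
    }

    public static void main(String[] args) {
        // Hypothetical cluster: the hosts we expect to take part in the job.
        Set<String> expected = new TreeSet<>(Arrays.asList("node1", "node2", "node3"));
        // Hypothetical observation: one entry per task attempt, naming the host that ran it.
        List<String> attempts = Arrays.asList("node1", "node3", "node1");

        Set<String> idle = idleHosts(expected, attempts);
        System.out.println(idle.isEmpty() ? "PASS" : "FAIL: idle hosts " + idle);
    }
}
```

With the sample data above, node2 never appears in the attempt list, so the check reports a failure for that host.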


                
      was (Author: stones333):
    Thanks for your time for review and guidance. Putting the code under org.apache.bigtop.itest.executors.cli makes sense to me, we can better leverage the same functionality. I will take a closer look at the BigTop's architecture/packaging and see how this piece would fit. 

I have another question in my mind, and I put it out here: in the land of Hadoop clustered environment after the successful BigTop's deployment, how does BigTop check and see if all multiple datanodes (desired) participate in M/R job in programmable way?    


                  
> Improvement of BigTop iTest framework
> -------------------------------------
>
>                 Key: BIGTOP-476
>                 URL: https://issues.apache.org/jira/browse/BIGTOP-476
>             Project: Bigtop
>          Issue Type: Improvement
>          Components: Tests
>    Affects Versions: 0.3.0
>         Environment: All
>            Reporter: Lei Zou
>            Assignee: Lei Zou
>            Priority: Minor
>              Labels: test
>             Fix For: 0.4.0
>
>         Attachments: BigTopIntegrationTest.java, BigTopIntegrationTestFacade.groovy, BigTopTestCommand.java, ExtactComparatorIgnoreWhiteSpace.groovy, TestRunHadoopExamples.groovy, bigtop-testcases.yaml, bigtop-tests_test-artifacts_pom.xml.diff, bigtop-tests_test-execution_smokes_hadoop_pom.xml.diff
>
>   Original Estimate: 168h
>  Remaining Estimate: 168h
>
> The current BigTop test framework is limited in its ability to handle dynamically generated data. Its flexibility can be improved. 
> For org.apache.bigtop.itest.hadoopexamples.TestHadoopExamples
> Limitation: anyone who wants to make a change needs to modify 
> ./bigtop-tests/test-artifacts/hadoop/src/main/groovy/org/apache/bigtop/itest/hadoopexamples/TestHadoopExamples.groovy, which then has to be compiled before the tests can run. 
> For org.apache.bigtop.itest.hadooptests.TestTestCLI, the configuration file,
> ./build/hadoop/deb/hadoop-1.0.1/src/test/org/apache/hadoop/cli/testConf.xml, has entries like the following: 
>    <test> <!-- TESTED -->
>       <description>ls: file using relative path</description>
>       <test-commands>
>         <command>-fs NAMENODE -touchz file1</command>
>         <command>-fs NAMENODE -ls file1</command>
>       </test-commands>
>       <cleanup-commands>
>         <command>-fs NAMENODE -rm file1</command>
>       </cleanup-commands>
>       <comparators>
>         <comparator>
>           <type>TokenComparator</type>
>           <expected-output>Found 1 items</expected-output>
>         </comparator>
>         <comparator>
>           <type>RegexpComparator</type>
>           <expected-output>^-rw-r--r--( )*1( )*[a-z]*( )*supergroup( )*0( )*[0-9]{4,}-[0-9]{2,}-[0-9]{2,} [0-9]{2,}:[0-9]{2,}( )*/user/[a-z]*/file1</expected-output>
>         </comparator>
>       </comparators>
>     </test>
> Limitation: Specifying an expected-output and then performing a string comparison works for fixed results, but it is still not flexible enough to handle dynamically generated data. For example, suppose a program randomly generates key/value pairs and then submits an M/R job to calculate the sum (or average) for each key. There is no way to calculate the result in advance and write it down as the expected-output. 
> I am proposing an improvement to BigTop's integration tests. We can put all test cases in an XML file that contains a list of command-sets; each command-set has a command, a command-comparator-type, and a command-comparator-compare-to. The command is a hadoop/hbase/hive command; command-comparator-type specifies the Java class that performs the comparison; command-comparator-compare-to specifies the shell command that generates the expected output. 
> Here are three example cases:
> <?xml version="1.0" encoding="ISO-8859-1"?>
> <bigtop-itest-suite>
>     <bigtop-itest-suite-test>
>         <test-name>Calculate summation in MR</test-name>
>         <test-desc>Here is simple MR test to calculate sum</test-desc>
>         <test-pre-integration-test>
>         </test-pre-integration-test>
>         <test-integration-test>
>             <command-set>
>                 <command>hadoop jar ./target/LeiBigTop-1.1.jar com.lei.bigtop.hadoop.calsum.CalSum ./data ./output</command>
>                 <command-comparator-type>com.lei.bigtop.hadoop.integration.test.ExtactComparatorIgnoreWhiteSpace</command-comparator-type>
>                 <command-comparator-compare-to><![CDATA[ cat ./output/* ]]></command-comparator-compare-to>
>             </command-set>
>         </test-integration-test>
>         <test-post-integration-test>
>         </test-post-integration-test>
>     </bigtop-itest-suite-test>
>     <bigtop-itest-suite-test>
>         <test-name>calculate pi</test-name>
>         <test-desc>calculate pi using hadoop MR</test-desc>
>         <test-pre-integration-test>
>         </test-pre-integration-test>
>         <test-integration-test>
>             <command-set>
>                 <command>hadoop jar $HADOOP_HOME/hadoop-examples-0.*.0.jar pi 5 5</command>
>                 <command-comparator-type>org.apache.hadoop.cli.util.SubstringComparator</command-comparator-type>
>                 <command-comparator-compare-to><![CDATA[echo "Pi is 3.68"]]></command-comparator-compare-to>
>             </command-set>
>         </test-integration-test>
>         <test-post-integration-test>
>         </test-post-integration-test>
>     </bigtop-itest-suite-test>
>     <bigtop-itest-suite-test>
>         <test-name>count word in MR</test-name>
>         <test-desc>count word in Hadoop MR</test-desc>
>         <test-pre-integration-test>
>             <command-set><command>rm -rf ./wordcount</command></command-set>
>             <command-set><command>rm -rf ./wordcount_out</command></command-set>
>             <command-set><command>mkdir ./wordcount</command></command-set>
>             <command-set><command><![CDATA[curl http://www.meetup.com/HandsOnProgrammingEvents/events/53837022/ | sed -e :a -e 's/<[^>]*>//g;/</N;//ba' | sed 's/&nbsp//g' | sed 's/^[ \t]*//;s/[ \t]*$//'  | sed '/^$/d' | sed '/"http[^"]*"/d' > ./wordcount/content]]></command></command-set>
>             <command-set><command>hadoop fs -mkdir /wordcount</command></command-set>
>             <command-set><command>hadoop fs -put ./wordcount/* /wordcount</command></command-set>
>         </test-pre-integration-test>
>         <test-integration-test>
>             <command-set><command>hadoop jar $HADOOP_HOME/hadoop-examples-0.*.0.jar wordcount /wordcount /wordcount_out</command></command-set>
>             <command-set><command>mkdir ./wordcount_out</command></command-set>
>             <command-set><command>hadoop fs -get /wordcount_out/* ./wordcount_out</command></command-set>
>             <command-set><command>hadoop fs -rmr /wordcount</command></command-set>
>             <command-set><command>hadoop fs -rmr /wordcount_out/</command></command-set>
>         </test-integration-test>
>         <test-post-integration-test>
>             <command-set>
>                 <command>cat ./wordcount_out/* | grep  Roman | sed 's/[^0-9.]*\([0-9.]*\).*/\1/'</command>
>                 <command-comparator-type>com.lei.bigtop.hadoop.integration.test.ExtactComparatorIgnoreWhiteSpace</command-comparator-type>
>                 <command-comparator-compare-to><![CDATA[cat wordcount/* | grep -c Roman]]></command-comparator-compare-to>
>             </command-set>
>         </test-post-integration-test>
>     </bigtop-itest-suite-test>
> </bigtop-itest-suite>
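
As a rough illustration of how one command-set from the quoted proposal might be evaluated, here is a minimal Java sketch: run the command, run the compare-to shell command to produce the expected output, and compare the two while ignoring whitespace. The class and method names are hypothetical, the comparison is only a guess at what a whitespace-insensitive exact comparator would do (it is not the attached ExtactComparatorIgnoreWhiteSpace), and it assumes a POSIX `sh` is available on the test machine.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

public class CommandSetSketch {

    /** Runs a shell command through "sh -c" and returns everything it wrote to stdout. */
    static String run(String shellCommand) throws IOException, InterruptedException {
        Process p = new ProcessBuilder("sh", "-c", shellCommand)
                // Pass stderr straight through so a chatty command cannot block on a full pipe.
                .redirectError(ProcessBuilder.Redirect.INHERIT)
                .start();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (InputStream in = p.getInputStream()) {
            byte[] buf = new byte[4096];
            int n;
            while ((n = in.read(buf)) != -1) {
                out.write(buf, 0, n);
            }
        }
        p.waitFor();
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }

    /** Exact comparison after removing all whitespace from both sides. */
    static boolean equalIgnoringWhitespace(String actual, String expected) {
        return actual.replaceAll("\\s+", "").equals(expected.replaceAll("\\s+", ""));
    }

    public static void main(String[] args) throws Exception {
        // Stand-ins for <command> and <command-comparator-compare-to>;
        // a real suite would read both strings from the XML file.
        String actual = run("printf 'Roman\\nRoman\\nGreek\\n' | grep -c Roman");
        String expected = run("echo ' 2 '");
        System.out.println(equalIgnoringWhitespace(actual, expected) ? "PASS" : "FAIL");
    }
}
```

Because the expected value comes from a second command rather than a fixed string, the same check keeps working when the input data changes between runs, which is the point of the compare-to element.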
