You are viewing a plain text version of this content. The canonical link for it is here.
Posted to common-dev@hadoop.apache.org by "Steve Loughran (JIRA)" <ji...@apache.org> on 2008/07/04 12:53:31 UTC

[jira] Created: (HADOOP-3694) if MiniDFS startup time could be improved, testing time would be reduced

if MiniDFS startup time could be improved, testing time would be reduced
------------------------------------------------------------------------

                 Key: HADOOP-3694
                 URL: https://issues.apache.org/jira/browse/HADOOP-3694
             Project: Hadoop Core
          Issue Type: Improvement
          Components: test
    Affects Versions: 0.19.0
            Reporter: Steve Loughran


Its taking me 140 minutes to run a test build; looking into the test results its the 20s startup delay of every MiniDFS cluster that is slowing things down. If we could find out why it is taking so long and cut it down, every test case that relied on a cluster would be speeded up. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-3694) if MiniDFS startup time could be improved, testing time would be reduced

Posted by "Steve Loughran (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-3694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12617748#action_12617748 ] 

Steve Loughran commented on HADOOP-3694:
----------------------------------------

2 tests failed. 

org.apache.hadoop.hdfs.server.namenode.TestCheckpoint.testCheckpoint

java.io.FileNotFoundException: http://localhost:41372/getimage?putimage=1&port=41391&machine=140.211.11.106&token=-16:32382157:0:1217291770000:1217291764363
	at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1239)
	at org.apache.hadoop.hdfs.server.namenode.TransferFsImage.getFileClient(TransferFsImage.java:151)
	at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.putFSImage(SecondaryNameNode.java:274)
	at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.doCheckpoint(SecondaryNameNode.java:314)
	at org.apache.hadoop.hdfs.server.namenode.TestCheckpoint.testCheckpoint(TestCheckpoint.java:557)

and  org.apache.hadoop.hdfs.TestInjectionForSimulatedStorage.testInjection

java.io.IOException: Timedout while waiting for all blocks to  be replicated for /replication-test-file
	at org.apache.hadoop.hdfs.TestInjectionForSimulatedStorage.waitForBlockReplication(TestInjectionForSimulatedStorage.java:100)
	at org.apache.hadoop.hdfs.TestInjectionForSimulatedStorage.testInjection(TestInjectionForSimulatedStorage.java:151)


> if MiniDFS startup time could be improved, testing time would be reduced
> ------------------------------------------------------------------------
>
>                 Key: HADOOP-3694
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3694
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: test
>    Affects Versions: 0.19.0
>            Reporter: Steve Loughran
>            Assignee: Doug Cutting
>             Fix For: 0.19.0
>
>         Attachments: HADOOP-3694.patch, HADOOP-3694.patch
>
>
> Its taking me 140 minutes to run a test build; looking into the test results its the 20s startup delay of every MiniDFS cluster that is slowing things down. If we could find out why it is taking so long and cut it down, every test case that relied on a cluster would be speeded up. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-3694) if MiniDFS startup time could be improved, testing time would be reduced

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-3694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12618047#action_12618047 ] 

Hadoop QA commented on HADOOP-3694:
-----------------------------------

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12387146/HADOOP-3694.patch
  against trunk revision 680823.

    +1 @author.  The patch does not contain any @author tags.

    +1 tests included.  The patch appears to include 27 new or modified tests.

    +1 javadoc.  The javadoc tool did not generate any warning messages.

    +1 javac.  The applied patch does not increase the total number of javac compiler warnings.

    +1 findbugs.  The patch does not introduce any new Findbugs warnings.

    +1 release audit.  The applied patch does not increase the total number of release audit warnings.

    -1 core tests.  The patch failed core unit tests.

    +1 contrib tests.  The patch passed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2978/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2978/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2978/artifact/trunk/build/test/checkstyle-errors.html
Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2978/console

This message is automatically generated.

> if MiniDFS startup time could be improved, testing time would be reduced
> ------------------------------------------------------------------------
>
>                 Key: HADOOP-3694
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3694
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: test
>    Affects Versions: 0.19.0
>            Reporter: Steve Loughran
>            Assignee: Doug Cutting
>             Fix For: 0.19.0
>
>         Attachments: HADOOP-3694.patch, HADOOP-3694.patch, HADOOP-3694.patch
>
>
> Its taking me 140 minutes to run a test build; looking into the test results its the 20s startup delay of every MiniDFS cluster that is slowing things down. If we could find out why it is taking so long and cut it down, every test case that relied on a cluster would be speeded up. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-3694) if MiniDFS startup time could be improved, testing time would be reduced

Posted by "Pi Song (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-3694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12611546#action_12611546 ] 

Pi Song commented on HADOOP-3694:
---------------------------------

+1
This is killing me in Pig as well.
It seems to be something to do with Jetty which usually takes 15 seconds or more to start up

{noformat}
08/07/08 22:05:48 INFO fs.FSNamesystem: Registered FSNamesystemStatusMBean
08/07/08 22:06:03 INFO util.Credential: Checking Resource aliases
08/07/08 22:06:03 INFO http.HttpServer: Version Jetty/5.1.x

08/07/08 22:07:53 INFO mapred.MiniMRCluster: mapred.local.dir is /tmp/hadoop-root/mapred/local/2_0
08/07/08 22:08:08 INFO http.HttpServer: Version Jetty/5.1.x
08/07/08 22:08:08 INFO util.Container: Started HttpContext[/static,/static]
{noformat}


> if MiniDFS startup time could be improved, testing time would be reduced
> ------------------------------------------------------------------------
>
>                 Key: HADOOP-3694
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3694
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: test
>    Affects Versions: 0.19.0
>            Reporter: Steve Loughran
>
> Its taking me 140 minutes to run a test build; looking into the test results its the 20s startup delay of every MiniDFS cluster that is slowing things down. If we could find out why it is taking so long and cut it down, every test case that relied on a cluster would be speeded up. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-3694) if MiniDFS startup time could be improved, testing time would be reduced

Posted by "Tsz Wo (Nicholas), SZE (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-3694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12618498#action_12618498 ] 

Tsz Wo (Nicholas), SZE commented on HADOOP-3694:
------------------------------------------------

> The failed test last night (TestFileCreation#testClientTriggeredLeaseRecovery) is unrelated. The problem there is that it uses Thread.interrupt() to stop threads that are writing to HDFS, but that's not reliable, since DFSClient catches and ignores InterruptedException in many places (which is a bug). So this test fails sporadically.

+1 on adding a boolean for stopping the SlowWriter threads.

> if MiniDFS startup time could be improved, testing time would be reduced
> ------------------------------------------------------------------------
>
>                 Key: HADOOP-3694
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3694
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: test
>    Affects Versions: 0.19.0
>            Reporter: Steve Loughran
>            Assignee: Doug Cutting
>             Fix For: 0.19.0
>
>         Attachments: HADOOP-3694.patch, HADOOP-3694.patch, HADOOP-3694.patch, HADOOP-3694.patch
>
>
> Its taking me 140 minutes to run a test build; looking into the test results its the 20s startup delay of every MiniDFS cluster that is slowing things down. If we could find out why it is taking so long and cut it down, every test case that relied on a cluster would be speeded up. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HADOOP-3694) if MiniDFS startup time could be improved, testing time would be reduced

Posted by "Doug Cutting (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-3694?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Doug Cutting updated HADOOP-3694:
---------------------------------

      Resolution: Fixed
    Hadoop Flags: [Reviewed]
          Status: Resolved  (was: Patch Available)

I just committed this.


> if MiniDFS startup time could be improved, testing time would be reduced
> ------------------------------------------------------------------------
>
>                 Key: HADOOP-3694
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3694
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: test
>    Affects Versions: 0.19.0
>            Reporter: Steve Loughran
>            Assignee: Doug Cutting
>             Fix For: 0.19.0
>
>         Attachments: HADOOP-3694.patch, HADOOP-3694.patch, HADOOP-3694.patch, HADOOP-3694.patch
>
>
> Its taking me 140 minutes to run a test build; looking into the test results its the 20s startup delay of every MiniDFS cluster that is slowing things down. If we could find out why it is taking so long and cut it down, every test case that relied on a cluster would be speeded up. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-3694) if MiniDFS startup time could be improved, testing time would be reduced

Posted by "Steve Loughran (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-3694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12618641#action_12618641 ] 

Steve Loughran commented on HADOOP-3694:
----------------------------------------

I'm +1 for a commit. we can look for more speedups later, but when testing, every minute helps. 

> if MiniDFS startup time could be improved, testing time would be reduced
> ------------------------------------------------------------------------
>
>                 Key: HADOOP-3694
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3694
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: test
>    Affects Versions: 0.19.0
>            Reporter: Steve Loughran
>            Assignee: Doug Cutting
>             Fix For: 0.19.0
>
>         Attachments: HADOOP-3694.patch, HADOOP-3694.patch, HADOOP-3694.patch, HADOOP-3694.patch
>
>
> Its taking me 140 minutes to run a test build; looking into the test results its the 20s startup delay of every MiniDFS cluster that is slowing things down. If we could find out why it is taking so long and cut it down, every test case that relied on a cluster would be speeded up. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HADOOP-3694) if MiniDFS startup time could be improved, testing time would be reduced

Posted by "Doug Cutting (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-3694?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Doug Cutting updated HADOOP-3694:
---------------------------------

    Attachment: HADOOP-3694.patch

Here's an even more conservative patch, which only updates MiniDFSCluster, and also includes a fix to make TestFileCreation more reliable.

> if MiniDFS startup time could be improved, testing time would be reduced
> ------------------------------------------------------------------------
>
>                 Key: HADOOP-3694
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3694
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: test
>    Affects Versions: 0.19.0
>            Reporter: Steve Loughran
>            Assignee: Doug Cutting
>             Fix For: 0.19.0
>
>         Attachments: HADOOP-3694.patch, HADOOP-3694.patch, HADOOP-3694.patch, HADOOP-3694.patch
>
>
> Its taking me 140 minutes to run a test build; looking into the test results its the 20s startup delay of every MiniDFS cluster that is slowing things down. If we could find out why it is taking so long and cut it down, every test case that relied on a cluster would be speeded up. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-3694) if MiniDFS startup time could be improved, testing time would be reduced

Posted by "Steve Loughran (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-3694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12611542#action_12611542 ] 

Steve Loughran commented on HADOOP-3694:
----------------------------------------

well, the only thing in there that would be slow is 

this.dir.loadFSImage(getNamespaceDirs(conf), startOpt);

and the startopt appears to be FORMAT in most miniDFS operations, so its time to format a test filesystem that is causing the delays.

Possibilities: 
* improve this time for very small fileystems (e.g. the test ones)
* create a ready to roll image during part of the test run; reuse this later. 

The other thing to look at is shutdown time...how long does it take for everything to go away?


> if MiniDFS startup time could be improved, testing time would be reduced
> ------------------------------------------------------------------------
>
>                 Key: HADOOP-3694
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3694
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: test
>    Affects Versions: 0.19.0
>            Reporter: Steve Loughran
>
> Its taking me 140 minutes to run a test build; looking into the test results its the 20s startup delay of every MiniDFS cluster that is slowing things down. If we could find out why it is taking so long and cut it down, every test case that relied on a cluster would be speeded up. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HADOOP-3694) if MiniDFS startup time could be improved, testing time would be reduced

Posted by "Doug Cutting (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-3694?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Doug Cutting updated HADOOP-3694:
---------------------------------

    Attachment: HADOOP-3694.patch

Here's a less ambitious version which passes all tests locally.

> if MiniDFS startup time could be improved, testing time would be reduced
> ------------------------------------------------------------------------
>
>                 Key: HADOOP-3694
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3694
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: test
>    Affects Versions: 0.19.0
>            Reporter: Steve Loughran
>            Assignee: Doug Cutting
>             Fix For: 0.19.0
>
>         Attachments: HADOOP-3694.patch, HADOOP-3694.patch, HADOOP-3694.patch
>
>
> Its taking me 140 minutes to run a test build; looking into the test results its the 20s startup delay of every MiniDFS cluster that is slowing things down. If we could find out why it is taking so long and cut it down, every test case that relied on a cluster would be speeded up. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-3694) if MiniDFS startup time could be improved, testing time would be reduced

Posted by "Hudson (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-3694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12624773#action_12624773 ] 

Hudson commented on HADOOP-3694:
--------------------------------

Integrated in Hadoop-trunk #581 (See [http://hudson.zones.apache.org/hudson/job/Hadoop-trunk/581/])

> if MiniDFS startup time could be improved, testing time would be reduced
> ------------------------------------------------------------------------
>
>                 Key: HADOOP-3694
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3694
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: test
>    Affects Versions: 0.19.0
>            Reporter: Steve Loughran
>            Assignee: Doug Cutting
>             Fix For: 0.19.0
>
>         Attachments: HADOOP-3694.patch, HADOOP-3694.patch, HADOOP-3694.patch, HADOOP-3694.patch
>
>
> Its taking me 140 minutes to run a test build; looking into the test results its the 20s startup delay of every MiniDFS cluster that is slowing things down. If we could find out why it is taking so long and cut it down, every test case that relied on a cluster would be speeded up. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-3694) if MiniDFS startup time could be improved, testing time would be reduced

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-3694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12618532#action_12618532 ] 

Hadoop QA commented on HADOOP-3694:
-----------------------------------

+1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12387219/HADOOP-3694.patch
  against trunk revision 680823.

    +1 @author.  The patch does not contain any @author tags.

    +1 tests included.  The patch appears to include 6 new or modified tests.

    +1 javadoc.  The javadoc tool did not generate any warning messages.

    +1 javac.  The applied patch does not increase the total number of javac compiler warnings.

    +1 findbugs.  The patch does not introduce any new Findbugs warnings.

    +1 release audit.  The applied patch does not increase the total number of release audit warnings.

    +1 core tests.  The patch passed core unit tests.

    +1 contrib tests.  The patch passed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2986/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2986/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2986/artifact/trunk/build/test/checkstyle-errors.html
Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2986/console

This message is automatically generated.

> if MiniDFS startup time could be improved, testing time would be reduced
> ------------------------------------------------------------------------
>
>                 Key: HADOOP-3694
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3694
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: test
>    Affects Versions: 0.19.0
>            Reporter: Steve Loughran
>            Assignee: Doug Cutting
>             Fix For: 0.19.0
>
>         Attachments: HADOOP-3694.patch, HADOOP-3694.patch, HADOOP-3694.patch, HADOOP-3694.patch
>
>
> Its taking me 140 minutes to run a test build; looking into the test results its the 20s startup delay of every MiniDFS cluster that is slowing things down. If we could find out why it is taking so long and cut it down, every test case that relied on a cluster would be speeded up. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-3694) if MiniDFS startup time could be improved, testing time would be reduced

Posted by "Steve Loughran (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-3694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12617459#action_12617459 ] 

Steve Loughran commented on HADOOP-3694:
----------------------------------------

changing the ipc node from 0:0:0:0 (all interfaces) to localhost in src/test/hadoop-site.xml appears to help:

<property>
  <name>dfs.datanode.ipc.address</name>
  <value>localhost:50020</value>
  <description>address for datanodes is always the localhost</description>
</property>

We'll see how long it takes tonight.

> if MiniDFS startup time could be improved, testing time would be reduced
> ------------------------------------------------------------------------
>
>                 Key: HADOOP-3694
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3694
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: test
>    Affects Versions: 0.19.0
>            Reporter: Steve Loughran
>
> Its taking me 140 minutes to run a test build; looking into the test results its the 20s startup delay of every MiniDFS cluster that is slowing things down. If we could find out why it is taking so long and cut it down, every test case that relied on a cluster would be speeded up. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-3694) if MiniDFS startup time could be improved, testing time would be reduced

Posted by "Steve Loughran (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-3694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12612083#action_12612083 ] 

Steve Loughran commented on HADOOP-3694:
----------------------------------------

Turning off IPv6 doesnt help, just changes the stack trace:

    [junit] BEA JRockit(R) R27.4.0-90-89592-1.6.0_02-20070928-1715-linux-x86_64
    [junit] "Main Thread" id=1 idx=0x4 tid=11333 prio=5 alive, in native
    [junit]     at java/net/Inet4AddressImpl.getHostByAddr([B)Ljava/lang/String;(Native Method)
    [junit]     at java/net/InetAddress$1.getHostByAddr(InetAddress.java:853)
    [junit]     at java/net/InetAddress.getHostFromNameService(InetAddress.java:533)
    [junit]     at java/net/InetAddress.getHostName(InetAddress.java:476)
    [junit]     at java/net/InetAddress.getHostName(InetAddress.java:448)
    [junit]     at java/net/InetSocketAddress.getHostName(InetSocketAddress.java:210)
    [junit]     at org/apache/hadoop/hdfs/server/namenode/FSNamesystem.initialize(FSNamesystem.java:316)
    [junit]     at org/apache/hadoop/hdfs/server/namenode/FSNamesystem.<init>(FSNamesystem.java:263)
    [junit]     at org/apache/hadoop/hdfs/server/namenode/NameNode.initialize(NameNode.java:159)
    [junit]     at org/apache/hadoop/hdfs/server/namenode/NameNode.start(NameNode.java:224)
    [junit]     ^-- Holding lock: org/apache/hadoop/hdfs/server/namenode/NameNode@0x1132d80[thin lock]
    [junit]     at org/apache/hadoop/util/Service.deploy(Service.java:261)
    [junit]     at org/apache/hadoop/hdfs/server/namenode/NameNode.createNameNode(NameNode.java:893)
    [junit]     at org/apache/hadoop/hdfs/MiniDFSCluster.<init>(MiniDFSCluster.java:274)
    [junit]     at org/apache/hadoop/hdfs/MiniDFSCluster.<init>(MiniDFSCluster.java:123)
    [junit]     at org/apache/hadoop/fs/TestCopyFiles.testCopyFromLocalToDfs(TestCopyFiles.java:271)
    [junit]     at jrockit/vm/RNI.c2java(JJJJJ)V(Native Method)

Its the determining of the local hostname from the ipaddress that the DFS is listening on that usually appears in the stack trace in a startup. Its not that is is particularly slow, a few tens of seconds, but that being run on every testcase, it brings the system to its knees. its also why the CPU load during testing is slow: they are not CPU bound, but driven by network delays.

> if MiniDFS startup time could be improved, testing time would be reduced
> ------------------------------------------------------------------------
>
>                 Key: HADOOP-3694
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3694
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: test
>    Affects Versions: 0.19.0
>            Reporter: Steve Loughran
>
> Its taking me 140 minutes to run a test build; looking into the test results its the 20s startup delay of every MiniDFS cluster that is slowing things down. If we could find out why it is taking so long and cut it down, every test case that relied on a cluster would be speeded up. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HADOOP-3694) if MiniDFS startup time could be improved, testing time would be reduced

Posted by "Doug Cutting (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-3694?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Doug Cutting updated HADOOP-3694:
---------------------------------

    Status: Patch Available  (was: Open)

> if MiniDFS startup time could be improved, testing time would be reduced
> ------------------------------------------------------------------------
>
>                 Key: HADOOP-3694
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3694
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: test
>    Affects Versions: 0.19.0
>            Reporter: Steve Loughran
>            Assignee: Doug Cutting
>             Fix For: 0.19.0
>
>         Attachments: HADOOP-3694.patch, HADOOP-3694.patch, HADOOP-3694.patch, HADOOP-3694.patch
>
>
> Its taking me 140 minutes to run a test build; looking into the test results its the 20s startup delay of every MiniDFS cluster that is slowing things down. If we could find out why it is taking so long and cut it down, every test case that relied on a cluster would be speeded up. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-3694) if MiniDFS startup time could be improved, testing time would be reduced

Posted by "Steve Loughran (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-3694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12611952#action_12611952 ] 

Steve Loughran commented on HADOOP-3694:
----------------------------------------

A quick search here points a bit at entropy related https delays
http://docs.codehaus.org/display/JETTY/Connectors+slow+to+startup

This is backed up by email from Greg Wilkins
"the culprit that comes to mind for startup delays is the fact that
jetty uses a securerandom for the session manager, which needs enough
entropy in the operating systems random source before it starts.

If the machine is newly started or other things have consumed that entropy,
then it can be slow starting while the OS waits for interrupts and
other such stuff to create random numbers.

A simple threaddump during this pause should answer this.

If it is that- there are several options - including not using
sessions, or a less secure random number generator."

> if MiniDFS startup time could be improved, testing time would be reduced
> ------------------------------------------------------------------------
>
>                 Key: HADOOP-3694
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3694
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: test
>    Affects Versions: 0.19.0
>            Reporter: Steve Loughran
>
> Its taking me 140 minutes to run a test build; looking into the test results its the 20s startup delay of every MiniDFS cluster that is slowing things down. If we could find out why it is taking so long and cut it down, every test case that relied on a cluster would be speeded up. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-3694) if MiniDFS startup time could be improved, testing time would be reduced

Posted by "Steve Loughran (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-3694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12611965#action_12611965 ] 

Steve Loughran commented on HADOOP-3694:
----------------------------------------

Following up on Greg's suggestion, I dont think it is jetty. Whenever I kill -QUIT a test case I see my machine stuck in Inet6addressImpl.getHostByAddr()

I've seen this on 3 different test cases so far: different tests, all in getHostName(). This is on my machine on a managed network, with IPv6 as well as IPv4.

        [junit] Running org.apache.hadoop.hdfs.TestAbandonBlock
    [junit] ===== FULL THREAD DUMP ===============
    [junit] Wed Jul  9 11:02:07 2008
    [junit] BEA JRockit(R) R27.4.0-90-89592-1.6.0_02-20070928-1715-linux-x86_64
    [junit] "Main Thread" id=1 idx=0x4 tid=23092 prio=5 alive, in native
    [junit]     at java/net/Inet6AddressImpl.getHostByAddr([B)Ljava/lang/String;(Native Method)
    [junit]     at java/net/InetAddress$1.getHostByAddr(InetAddress.java:853)
    [junit]     at java/net/InetAddress.getHostFromNameService(InetAddress.java:533)
    [junit]     at java/net/InetAddress.getHostName(InetAddress.java:476)
    [junit]     at java/net/InetAddress.getHostName(InetAddress.java:448)
    [junit]     at java/net/InetSocketAddress.getHostName(InetSocketAddress.java:210)
    [junit]     at org/apache/hadoop/hdfs/server/datanode/DataNode.startDataNode(DataNode.java:414)
    [junit]     at org/apache/hadoop/hdfs/server/datanode/DataNode.start(DataNode.java:226)
    [junit]     at org/apache/hadoop/util/Service.deploy(Service.java:261)
    [junit]     at org/apache/hadoop/hdfs/server/datanode/DataNode.makeInstance(DataNode.java:3062)
    [junit]     at org/apache/hadoop/hdfs/server/datanode/DataNode.instantiateDataNode(DataNode.java:3013)
    [junit]     at org/apache/hadoop/hdfs/MiniDFSCluster.startDataNodes(MiniDFSCluster.java:411)
    [junit]     ^-- Holding lock: org/apache/hadoop/hdfs/MiniDFSCluster@0x3a5e4d0[thin lock]
    [junit]     at org/apache/hadoop/hdfs/MiniDFSCluster.<init>(MiniDFSCluster.java:277)
    [junit]     at org/apache/hadoop/hdfs/MiniDFSCluster.<init>(MiniDFSCluster.java:123)
    [junit]     at org/apache/hadoop/hdfs/TestAbandonBlock.testAbandonBlock(TestAbandonBlock.java:38)

TestCLI is taking its time for a different reason; there's a sleep in teardown()
    [junit] "Main Thread" id=1 idx=0x4 tid=11685 prio=5 alive, in native, sleeping, native_waiting
    [junit]     at java/lang/Thread.sleep(J)V(Native Method)
    [junit]     at org/apache/hadoop/cli/TestCLI.tearDown(TestCLI.java:137)

Can someone on a different machine 
1. run the tests
2. bring up top in a different window to see what is the Java process ID for a test case
3. kill -QUIT that process and see where we are blocking. 

If it is networking, it may behave differently for different users, and machines with IPv6 suffer more. 




> if MiniDFS startup time could be improved, testing time would be reduced
> ------------------------------------------------------------------------
>
>                 Key: HADOOP-3694
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3694
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: test
>    Affects Versions: 0.19.0
>            Reporter: Steve Loughran
>
> Its taking me 140 minutes to run a test build; looking into the test results its the 20s startup delay of every MiniDFS cluster that is slowing things down. If we could find out why it is taking so long and cut it down, every test case that relied on a cluster would be speeded up. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-3694) if MiniDFS startup time could be improved, testing time would be reduced

Posted by "Doug Cutting (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-3694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12618533#action_12618533 ] 

Doug Cutting commented on HADOOP-3694:
--------------------------------------

The overall test runtime was over 10% faster than most other recent successful runs (1:25 from ~1:36).  Not huge, but something...  Should we commit this?

> if MiniDFS startup time could be improved, testing time would be reduced
> ------------------------------------------------------------------------
>
>                 Key: HADOOP-3694
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3694
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: test
>    Affects Versions: 0.19.0
>            Reporter: Steve Loughran
>            Assignee: Doug Cutting
>             Fix For: 0.19.0
>
>         Attachments: HADOOP-3694.patch, HADOOP-3694.patch, HADOOP-3694.patch, HADOOP-3694.patch
>
>
> Its taking me 140 minutes to run a test build; looking into the test results its the 20s startup delay of every MiniDFS cluster that is slowing things down. If we could find out why it is taking so long and cut it down, every test case that relied on a cluster would be speeded up. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HADOOP-3694) if MiniDFS startup time could be improved, testing time would be reduced

Posted by "Doug Cutting (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-3694?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Doug Cutting updated HADOOP-3694:
---------------------------------

    Fix Version/s: 0.19.0
         Assignee: Doug Cutting
           Status: Patch Available  (was: Open)

This works on my machine for the tests I've tried.  Let's see if the full set  runs on Hudson, and, if so, how much faster it runs there.

> if MiniDFS startup time could be improved, testing time would be reduced
> ------------------------------------------------------------------------
>
>                 Key: HADOOP-3694
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3694
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: test
>    Affects Versions: 0.19.0
>            Reporter: Steve Loughran
>            Assignee: Doug Cutting
>             Fix For: 0.19.0
>
>         Attachments: HADOOP-3694.patch, HADOOP-3694.patch
>
>
> Its taking me 140 minutes to run a test build; looking into the test results its the 20s startup delay of every MiniDFS cluster that is slowing things down. If we could find out why it is taking so long and cut it down, every test case that relied on a cluster would be speeded up. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HADOOP-3694) if MiniDFS startup time could be improved, testing time would be reduced

Posted by "Doug Cutting (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-3694?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Doug Cutting updated HADOOP-3694:
---------------------------------

    Attachment: HADOOP-3694.patch

This address is not read from the conf, but rather specfied explicitly in MiniDFSCluster.java.  I replaced 0.0.0.0 with 127.0.0.1, and TestDFSMkdirs dropped from 20 seconds to 5 seconds.

Here's a patch which makes that replacement throughout test code.


> if MiniDFS startup time could be improved, testing time would be reduced
> ------------------------------------------------------------------------
>
>                 Key: HADOOP-3694
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3694
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: test
>    Affects Versions: 0.19.0
>            Reporter: Steve Loughran
>         Attachments: HADOOP-3694.patch
>
>
> Its taking me 140 minutes to run a test build; looking into the test results its the 20s startup delay of every MiniDFS cluster that is slowing things down. If we could find out why it is taking so long and cut it down, every test case that relied on a cluster would be speeded up. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-3694) if MiniDFS startup time could be improved, testing time would be reduced

Posted by "Steve Loughran (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-3694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12612019#action_12612019 ] 

Steve Loughran commented on HADOOP-3694:
----------------------------------------

Assuming this is the cause (and that same stack trace comes up, again and again), this is what the code is trying to do

DataNode.startDataNode()

    
    InetSocketAddress ipcAddr = NetUtils.createSocketAddr(     // 1
        conf.get("dfs.datanode.ipc.address"));
    String hostname = ipcAddr.getHostName();                         // 2
    ipcServer = RPC.getServer(this, hostname, ipcAddr.getPort(),  // 3
        conf.getInt("dfs.datanode.handler.count", 3), false, conf);

(1) get socket address from the dfs.datanode.ipc.address, which defaults to "0.0.0.0:50020"
(2) get the real hostname of the assigned socket
(3) open a server on this port. 


Inside NetUtils.createSocketAddr, the configuration string is parsed and the (hostname,port) values extracted. This hostname is then turned into a new address. 

1. If there is a static hostname -> hostname' mapping that is used

    if (getStaticResolution(hostname) != null) {
      hostname = getStaticResolution(hostname);
    }

2. else the OS/JVM does the work, to work out the address
    return new InetSocketAddress(hostname, port);

Somehow this is picking up an IPv6 address

Later, when ipcAddr.getHostName(); is called (in (2)), An attempt to rDNS this address is made. Unless your site is running IPv6 DNS, this isnt going to succeed, but you are going to take a 15-30s hit every time an attempt is made.

I'm going to see how to remove IPv6 from this machine, which has 1 real and two virtual interfaces as well as loopback, to see if this will make the problem go away, or at least make some mild improvements....

eth0      Link encap:Ethernet  HWaddr 00:1C:C4:17:CC:46  
          inet addr:16.XX.XX.XXX Bcast:16.XX.XX.255  Mask:255.255.252.0
          inet6 addr: fe80::21c:c4ff:fe17:cc46/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:1561368 errors:0 dropped:0 overruns:0 frame:0
          TX packets:12199689 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000 
          RX bytes:325311436 (310.2 MB)  TX bytes:2108807940 (1.9 GB)
          Interrupt:17 

lo        Link encap:Local Loopback  
          inet addr:127.0.0.1  Mask:255.0.0.0
          inet6 addr: ::1/128 Scope:Host
          UP LOOPBACK RUNNING  MTU:16436  Metric:1
          RX packets:2538947 errors:0 dropped:0 overruns:0 frame:0
          TX packets:2538947 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0 
          RX bytes:881374392 (840.5 MB)  TX bytes:881374392 (840.5 MB)

vmnet1    Link encap:Ethernet  HWaddr 00:50:56:C0:00:01  
          inet addr:192.168.66.1  Bcast:192.168.66.255  Mask:255.255.255.0
          inet6 addr: fe80::250:56ff:fec0:1/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:1151 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000 
          RX bytes:0 (0.0 b)  TX bytes:0 (0.0 b)

vmnet8    Link encap:Ethernet  HWaddr 00:50:56:C0:00:08  
          inet addr:192.168.142.1  Bcast:192.168.142.255  Mask:255.255.255.0
          inet6 addr: fe80::250:56ff:fec0:8/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:1151 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000 
          RX bytes:0 (0.0 b)  TX bytes:0 (0.0 b)





> if MiniDFS startup time could be improved, testing time would be reduced
> ------------------------------------------------------------------------
>
>                 Key: HADOOP-3694
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3694
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: test
>    Affects Versions: 0.19.0
>            Reporter: Steve Loughran
>
> Its taking me 140 minutes to run a test build; looking into the test results its the 20s startup delay of every MiniDFS cluster that is slowing things down. If we could find out why it is taking so long and cut it down, every test case that relied on a cluster would be speeded up. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-3694) if MiniDFS startup time could be improved, testing time would be reduced

Posted by "Steve Loughran (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-3694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12618317#action_12618317 ] 

Steve Loughran commented on HADOOP-3694:
----------------------------------------

One test is still failing (sigh)

org.apache.hadoop.hdfs.TestFileCreation.testClientTriggeredLeaseRecovery
junit.framework.AssertionFailedError: Timeout occurred. Please note the time in the report does not reflect the time until the timeout.

I do want to apply this patch, but as I've got my own changes to MiniDFS, don't want to mix things up. One thing that I can see from looking at the patch is that the constant 0:0:0:0 is being replaced everywhere with another string constant. How about it gets moved to a class TestConstants or similar, so that henceforth there is a single place to change the hostname value?

> if MiniDFS startup time could be improved, testing time would be reduced
> ------------------------------------------------------------------------
>
>                 Key: HADOOP-3694
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3694
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: test
>    Affects Versions: 0.19.0
>            Reporter: Steve Loughran
>            Assignee: Doug Cutting
>             Fix For: 0.19.0
>
>         Attachments: HADOOP-3694.patch, HADOOP-3694.patch, HADOOP-3694.patch
>
>
> Its taking me 140 minutes to run a test build; looking into the test results its the 20s startup delay of every MiniDFS cluster that is slowing things down. If we could find out why it is taking so long and cut it down, every test case that relied on a cluster would be speeded up. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HADOOP-3694) if MiniDFS startup time could be improved, testing time would be reduced

Posted by "Doug Cutting (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-3694?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Doug Cutting updated HADOOP-3694:
---------------------------------

    Attachment: HADOOP-3694.patch

Here's a new version that changes a few more tests from 0.0.0.0 to 127.0.0.1 and also shortens Thread.sleep's in a couple of long-running tests.

> if MiniDFS startup time could be improved, testing time would be reduced
> ------------------------------------------------------------------------
>
>                 Key: HADOOP-3694
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3694
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: test
>    Affects Versions: 0.19.0
>            Reporter: Steve Loughran
>         Attachments: HADOOP-3694.patch, HADOOP-3694.patch
>
>
> Its taking me 140 minutes to run a test build; looking into the test results its the 20s startup delay of every MiniDFS cluster that is slowing things down. If we could find out why it is taking so long and cut it down, every test case that relied on a cluster would be speeded up. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-3694) if MiniDFS startup time could be improved, testing time would be reduced

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-3694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12617638#action_12617638 ] 

Hadoop QA commented on HADOOP-3694:
-----------------------------------

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12387059/HADOOP-3694.patch
  against trunk revision 679930.

    +1 @author.  The patch does not contain any @author tags.

    +1 tests included.  The patch appears to include 33 new or modified tests.

    +1 javadoc.  The javadoc tool did not generate any warning messages.

    +1 javac.  The applied patch does not increase the total number of javac compiler warnings.

    +1 findbugs.  The patch does not introduce any new Findbugs warnings.

    +1 release audit.  The applied patch does not increase the total number of release audit warnings.

    -1 core tests.  The patch failed core unit tests.

    +1 contrib tests.  The patch passed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2963/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2963/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2963/artifact/trunk/build/test/checkstyle-errors.html
Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2963/console

This message is automatically generated.

> if MiniDFS startup time could be improved, testing time would be reduced
> ------------------------------------------------------------------------
>
>                 Key: HADOOP-3694
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3694
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: test
>    Affects Versions: 0.19.0
>            Reporter: Steve Loughran
>            Assignee: Doug Cutting
>             Fix For: 0.19.0
>
>         Attachments: HADOOP-3694.patch, HADOOP-3694.patch
>
>
> Its taking me 140 minutes to run a test build; looking into the test results its the 20s startup delay of every MiniDFS cluster that is slowing things down. If we could find out why it is taking so long and cut it down, every test case that relied on a cluster would be speeded up. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HADOOP-3694) if MiniDFS startup time could be improved, testing time would be reduced

Posted by "Doug Cutting (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-3694?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Doug Cutting updated HADOOP-3694:
---------------------------------

    Status: Patch Available  (was: Open)

> if MiniDFS startup time could be improved, testing time would be reduced
> ------------------------------------------------------------------------
>
>                 Key: HADOOP-3694
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3694
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: test
>    Affects Versions: 0.19.0
>            Reporter: Steve Loughran
>            Assignee: Doug Cutting
>             Fix For: 0.19.0
>
>         Attachments: HADOOP-3694.patch, HADOOP-3694.patch, HADOOP-3694.patch
>
>
> Its taking me 140 minutes to run a test build; looking into the test results its the 20s startup delay of every MiniDFS cluster that is slowing things down. If we could find out why it is taking so long and cut it down, every test case that relied on a cluster would be speeded up. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-3694) if MiniDFS startup time could be improved, testing time would be reduced

Posted by "Bill de hOra (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-3694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12611411#action_12611411 ] 

Bill de hOra commented on HADOOP-3694:
--------------------------------------

Looks like these calls in the MiniDFS ctor

{code}
235 UnixUserGroupInformation information = UnixUserGroupInformation.login( conf );

262 NameNode.format(conf);

272 nameNode = NameNode.createNameNode(args, conf);
{code}

2008-07-08 00:29:40,034 INFO  fs.FSNamesystem (FSNamesystem.java:initialize(286)) - Finished loading FSImage in 12461 msecs

suggests maybe FSNamesystem is where to start optimising?


> if MiniDFS startup time could be improved, testing time would be reduced
> ------------------------------------------------------------------------
>
>                 Key: HADOOP-3694
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3694
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: test
>    Affects Versions: 0.19.0
>            Reporter: Steve Loughran
>
> Its taking me 140 minutes to run a test build; looking into the test results its the 20s startup delay of every MiniDFS cluster that is slowing things down. If we could find out why it is taking so long and cut it down, every test case that relied on a cluster would be speeded up. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-3694) if MiniDFS startup time could be improved, testing time would be reduced

Posted by "Doug Cutting (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-3694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12611223#action_12611223 ] 

Doug Cutting commented on HADOOP-3694:
--------------------------------------

> If we could find out why it is taking so long and cut it down [ ... ]

+1  Decreasing unit test runtime increases developer productivity!

> if MiniDFS startup time could be improved, testing time would be reduced
> ------------------------------------------------------------------------
>
>                 Key: HADOOP-3694
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3694
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: test
>    Affects Versions: 0.19.0
>            Reporter: Steve Loughran
>
> Its taking me 140 minutes to run a test build; looking into the test results its the 20s startup delay of every MiniDFS cluster that is slowing things down. If we could find out why it is taking so long and cut it down, every test case that relied on a cluster would be speeded up. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HADOOP-3694) if MiniDFS startup time could be improved, testing time would be reduced

Posted by "Doug Cutting (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-3694?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Doug Cutting updated HADOOP-3694:
---------------------------------

    Status: Open  (was: Patch Available)

The failed test last night (TestFileCreation#testClientTriggeredLeaseRecovery) is unrelated.  The problem there is that it uses Thread.interrupt() to stop threads that are writing to HDFS, but that's not reliable, since DFSClient catches and ignores InterruptedException in many places (which is a bug).  So this test fails sporadically.

> if MiniDFS startup time could be improved, testing time would be reduced
> ------------------------------------------------------------------------
>
>                 Key: HADOOP-3694
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3694
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: test
>    Affects Versions: 0.19.0
>            Reporter: Steve Loughran
>            Assignee: Doug Cutting
>             Fix For: 0.19.0
>
>         Attachments: HADOOP-3694.patch, HADOOP-3694.patch, HADOOP-3694.patch
>
>
> Its taking me 140 minutes to run a test build; looking into the test results its the 20s startup delay of every MiniDFS cluster that is slowing things down. If we could find out why it is taking so long and cut it down, every test case that relied on a cluster would be speeded up. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-3694) if MiniDFS startup time could be improved, testing time would be reduced

Posted by "Raghu Angadi (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-3694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12618537#action_12618537 ] 

Raghu Angadi commented on HADOOP-3694:
--------------------------------------

Most of the difference because of TestDataBlockScanner.testBlockCorruptionRecoveryPolicy somehow taking nearly 10 minutes less. We should fix this test too.

+1 for committing none the less.


> if MiniDFS startup time could be improved, testing time would be reduced
> ------------------------------------------------------------------------
>
>                 Key: HADOOP-3694
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3694
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: test
>    Affects Versions: 0.19.0
>            Reporter: Steve Loughran
>            Assignee: Doug Cutting
>             Fix For: 0.19.0
>
>         Attachments: HADOOP-3694.patch, HADOOP-3694.patch, HADOOP-3694.patch, HADOOP-3694.patch
>
>
> Its taking me 140 minutes to run a test build; looking into the test results its the 20s startup delay of every MiniDFS cluster that is slowing things down. If we could find out why it is taking so long and cut it down, every test case that relied on a cluster would be speeded up. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HADOOP-3694) if MiniDFS startup time could be improved, testing time would be reduced

Posted by "Doug Cutting (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-3694?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Doug Cutting updated HADOOP-3694:
---------------------------------

    Status: Open  (was: Patch Available)

> if MiniDFS startup time could be improved, testing time would be reduced
> ------------------------------------------------------------------------
>
>                 Key: HADOOP-3694
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3694
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: test
>    Affects Versions: 0.19.0
>            Reporter: Steve Loughran
>            Assignee: Doug Cutting
>             Fix For: 0.19.0
>
>         Attachments: HADOOP-3694.patch, HADOOP-3694.patch, HADOOP-3694.patch
>
>
> Its taking me 140 minutes to run a test build; looking into the test results its the 20s startup delay of every MiniDFS cluster that is slowing things down. If we could find out why it is taking so long and cut it down, every test case that relied on a cluster would be speeded up. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.