
Posted to dev@nutch.apache.org by "Paul Baclace (JIRA)" <ji...@apache.org> on 2005/10/19 04:16:44 UTC

[jira] Created: (NUTCH-116) TestNDFS a JUnit test specifically for NDFS

TestNDFS a JUnit test specifically for NDFS
-------------------------------------------

         Key: NUTCH-116
         URL: http://issues.apache.org/jira/browse/NUTCH-116
     Project: Nutch
        Type: Test
  Components: fetcher, indexer, searcher  
    Versions: 0.8-dev    
    Reporter: Paul Baclace


TestNDFS is a JUnit test for NDFS using "pseudo multiprocessing" (or more strictly, pseudo distributed) meaning all daemons run in one process and sockets are used to communicate between daemons.  

The test permutes various block sizes, numbers of files, file sizes, and numbers of datanodes.  After creating one or more files and filling them with random data, one datanode is shut down, and then the files are verified. Next, all the random test files are deleted and we test for leakage (non-deletion) by directly checking the real directories corresponding to the datanodes still running.
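
The leakage check described above can be sketched roughly as follows. This is only an illustration, not the code in TestNDFS.java: the class name, the temporary directory, and the file name blk_1 are hypothetical, and a real datanode directory may hold files other than blocks.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.stream.Stream;

// Hypothetical helper: inspects a datanode's storage directory directly.
class LeakCheckSketch {
    // Counts regular files anywhere under dir. A nonzero count after all
    // test files were deleted would suggest leaked (undeleted) blocks.
    static long countFiles(Path dir) {
        try (Stream<Path> paths = Files.walk(dir)) {
            return paths.filter(Files::isRegularFile).count();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        // A temporary directory stands in for a datanode's real directory.
        Path dataDir = Files.createTempDirectory("datanode");
        Path block = Files.createFile(dataDir.resolve("blk_1"));
        System.out.println("files before delete: " + countFiles(dataDir));
        Files.delete(block);
        System.out.println("files after delete: " + countFiles(dataDir));
    }
}
```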


-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
   http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
   http://www.atlassian.com/software/jira

[jira] Updated: (NUTCH-116) TestNDFS a JUnit test specifically for NDFS

Posted by "Paul Baclace (JIRA)" <ji...@apache.org>.

     [ http://issues.apache.org/jira/browse/NUTCH-116?page=all ]

Paul Baclace updated NUTCH-116:
-------------------------------

    Attachment: required_by_TestNDFS_v3.patch

I found and fixed a problem with a standalone DataNode process exiting too early (this was not detected by the current unit tests).  The cause was a change in the required_by_TestNDFS patch.  main() now join()s all the subthreads via runAndWait(NutchConf), while run(NutchConf) can be used to start the subthreads without waiting for them to finish.  The v3 patch has the cumulative required_by_TestNDFS changes.
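
A minimal sketch of the run()/runAndWait() split, under stated assumptions: the class name, the worker count parameter (standing in for NutchConf), and the use of daemon threads are all hypothetical and are not taken from the patch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for a daemon that starts worker subthreads.
class DaemonSketch {
    static final AtomicInteger finished = new AtomicInteger();

    // Starts the subthreads and returns immediately, which is what a
    // test harness running several daemons in one process wants.
    static List<Thread> run(int workers) {
        List<Thread> threads = new ArrayList<>();
        for (int i = 0; i < workers; i++) {
            Thread t = new Thread(() -> {
                try {
                    Thread.sleep(50);           // pretend to do some work
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
                finished.incrementAndGet();
            }, "worker-" + i);
            // Assumption for illustration: daemon threads do not keep the
            // JVM alive, so a standalone main() has to join them.
            t.setDaemon(true);
            t.start();
            threads.add(t);
        }
        return threads;
    }

    // Starts the subthreads, then blocks until every one has finished.
    static void runAndWait(int workers) throws InterruptedException {
        for (Thread t : run(workers)) {
            t.join();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        runAndWait(3);
        System.out.println("finished workers: " + finished.get());
    }
}
```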

(comments_msgs_and_local_renames_during_TestNDFS.patch is still separate.)



[jira] Updated: (NUTCH-116) TestNDFS a JUnit test specifically for NDFS

Posted by "Paul Baclace (JIRA)" <ji...@apache.org>.

     [ http://issues.apache.org/jira/browse/NUTCH-116?page=all ]

Paul Baclace updated NUTCH-116:
-------------------------------

    Attachment: comments_msgs_and_local_renames_during_TestNDFS.patch


[jira] Updated: (NUTCH-116) TestNDFS a JUnit test specifically for NDFS

Posted by "Paul Baclace (JIRA)" <ji...@apache.org>.

     [ http://issues.apache.org/jira/browse/NUTCH-116?page=all ]

Paul Baclace updated NUTCH-116:
-------------------------------

    Attachment: required_by_TestNDFS_v2.patch

Change Notes revised for patch required_by_TestNDFS_v2.patch, which supersedes required_by_TestNDFS.patch:

src/java/org/apache/nutch/ipc/Server.java
  Set thread names to make it possible to follow logging output and know when proper shutdown has completed. Used the safer notifyAll() for Server.stop() and Server.join() instead of
  notify(), since the wait condition is on a public object; added comments that clarify
  the actual implementation.
  Tightened the join() to be a proper while (running) { wait(); } loop, which avoids the
  hazard of "spurious wakeup" in POSIX threads (as noted in Effective Java by Joshua Bloch).
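
  The guarded wait described above can be sketched as follows. This is a simplified illustration, not the code in Server.java; the class name StoppableServer and the isRunning() helper are hypothetical.

```java
// Hypothetical stand-in for a server whose join() blocks until stop() is called.
class StoppableServer {
    private boolean running = true;

    // Wakes every thread blocked in join(). notifyAll() is used because the
    // monitor is the server object itself, which outside code can also wait on.
    public synchronized void stop() {
        running = false;
        notifyAll();
    }

    // Re-checks the condition after every wakeup, so a spurious wakeup
    // cannot make join() return while the server is still running.
    public synchronized void join() throws InterruptedException {
        while (running) {
            wait();
        }
    }

    public synchronized boolean isRunning() {
        return running;
    }

    public static void main(String[] args) throws InterruptedException {
        StoppableServer server = new StoppableServer();
        new Thread(server::stop, "stopper").start();
        server.join();   // returns only once running is false
        System.out.println("running after join: " + server.isRunning());
    }
}
```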

src/java/org/apache/nutch/ndfs/DataNode.java
  improved logging details, added comments, improved error message,
  refactored reusable code into makeInstanceForDir(), added toString(),
  added properties ndfs.blockreport.intervalMsec and ndfs.datanode.startupMsec
  to allow the override of BLOCKREPORT_INTERVAL and DATANODE_STARTUP_PERIOD,
  respectively, in order to speed up TestNDFS runs (otherwise it would take an hour).
  These FSConstants fields are worth keeping as default values when a property is
  not set, so that the lookup idiom is:
    conf.getLong("ndfs.datanode.startupMsec", DATANODE_STARTUP_PERIOD);
  instead of:
    conf.getLong("ndfs.datanode.startupMsec", 1000*60*10);
  When a property lookup occurs in more than one place, it is best to have the
  default value come from FSConstants rather than have multiple, possibly 
  different, literal values as the default.
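
  A small sketch of that idiom, assuming java.util.Properties as a stand-in for the Nutch configuration object; the class name and the getLong() helper here are hypothetical, and the constant's value simply mirrors the literal quoted above.

```java
import java.util.Properties;

// Hypothetical stand-in: Properties plays the role of the configuration object.
class ConfDefaultsSketch {
    // Shared default, defined once (the role FSConstants plays in the notes).
    static final long DATANODE_STARTUP_PERIOD = 1000L * 60 * 10;

    // Returns the property as a long, or the supplied default when it is unset.
    static long getLong(Properties conf, String name, long defaultValue) {
        String value = conf.getProperty(name);
        return value == null ? defaultValue : Long.parseLong(value.trim());
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        // Normal case: no override, so the shared constant is used.
        System.out.println(getLong(conf, "ndfs.datanode.startupMsec", DATANODE_STARTUP_PERIOD));
        // Test case: a short override to speed up the run.
        conf.setProperty("ndfs.datanode.startupMsec", "2000");
        System.out.println(getLong(conf, "ndfs.datanode.startupMsec", DATANODE_STARTUP_PERIOD));
    }
}
```

  Every call site passes the same named constant, so the default cannot drift between code blocks.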

src/java/org/apache/nutch/ndfs/FSDataset.java
  added toString() methods used in logging elsewhere.

src/java/org/apache/nutch/ndfs/FSNamesystem.java
  Changed chooseTarget() to behave as commented rather than as implemented. The comment
  says it forbids picking a target on the same host, but it was using
  host:port as the basis of comparison, so different ports on the same host
  would appear to be different hosts. This mistake was probably the result of
  DatanodeInfo.getName() really returning host:port, not just the hostname that
  the method name implies (DatanodeInfo.getHost() removes the port number).
  Added property test.ndfs.same.host.targets.allowed, which allows target datanode
  selection to use the same host (the same host:port is never allowed).
  TestNDFS uses host:port comparison,
  and normal operation just uses 'host' to better distribute replicas.
  Simplified a chooseTarget() conditional which was redundantly
  checking against forbidden1, forbidden2 and the just-constructed
  forbiddenMachines containing the union of forbidden1 and forbidden2:

          if ((forbidden1 == null || ! forbidden1.contains(node)) &&
              (forbidden2 == null || ! forbidden2.contains(node)) &&
              (! forbiddenMachines.contains(node.getName()))) {
   
  The following:
     forbidden1.contains(node) == forbiddenMachines.contains(node.getName()) 
  is always true and uses host:port for the comparison.

  Added logging for previously
  silent errors, emitted more info for some logging, changed LOG.info() to
  LOG.warning(), added javadoc comments.
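
  The difference between comparing by host and by host:port can be sketched as below. This is an illustration only, not the FSNamesystem code: the class, the method signatures, and the sameHostAllowed flag (standing in for test.ndfs.same.host.targets.allowed) are hypothetical, and datanodes are reduced to "host:port" strings.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch of target selection over "host:port" datanode names.
class ChooseTargetSketch {
    // Strips the port from a "host:port" name.
    static String hostOf(String name) {
        int colon = name.indexOf(':');
        return colon < 0 ? name : name.substring(0, colon);
    }

    // Returns the first candidate that is not forbidden, or null if none.
    // With sameHostAllowed, nodes are compared by host:port (the test case);
    // otherwise they are compared by host only (normal operation).
    static String chooseTarget(List<String> candidates, Set<String> forbidden,
                               boolean sameHostAllowed) {
        Set<String> forbiddenKeys = new HashSet<>();
        for (String name : forbidden) {
            forbiddenKeys.add(sameHostAllowed ? name : hostOf(name));
        }
        for (String name : candidates) {
            String key = sameHostAllowed ? name : hostOf(name);
            if (!forbiddenKeys.contains(key)) {
                return name;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        List<String> nodes = Arrays.asList("localhost:7001", "localhost:7002");
        Set<String> used = new HashSet<>(Arrays.asList("localhost:7001"));
        System.out.println(chooseTarget(nodes, used, true));   // a second port on the same host qualifies
        System.out.println(chooseTarget(nodes, used, false));  // no other host is available
    }
}
```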

src/java/org/apache/nutch/ndfs/NameNode.java
  Added a way to stop the daemon for JUnit testing, added javadoc comments,
  renamed offerService() to join() to better indicate what the method
  really does, added property ndfs.namenode.handler.count to adjust the
  number of handlers to speed up testing, changed access of some fields
  from package to private (protected is also reasonable) to quickly indicate
  how self-contained the class is when studying the code.



[jira] Resolved: (NUTCH-116) TestNDFS a JUnit test specifically for NDFS

Posted by "Doug Cutting (JIRA)" <ji...@apache.org>.

     [ http://issues.apache.org/jira/browse/NUTCH-116?page=all ]
     
Doug Cutting resolved NUTCH-116:
--------------------------------

    Fix Version: 0.8-dev
     Resolution: Fixed

I just committed this.  Thanks, Paul, this is great to have!


[jira] Commented: (NUTCH-116) TestNDFS a JUnit test specifically for NDFS

Posted by "Doug Cutting (JIRA)" <ji...@apache.org>.

    [ http://issues.apache.org/jira/browse/NUTCH-116?page=comments#action_12332493 ] 

Doug Cutting commented on NUTCH-116:
------------------------------------

Paul,

This looks like good stuff.

I could commit it more easily if changes were restricted to those required by TestNDFS.  Changes to comments, documentation, logging, etc. are better contributed as separate patches.  It's also okay to submit a unit test that fails and then to submit fixes as separate patches.  That makes my job easier: I can first see that the unit test looks reasonable, then see that it fails, then see how the patch fixes it.  As it stands it will take me some time to fully evaluate this patch.

A few quick comments: 

If BLOCKREPORT_INTERVAL and DATANODE_STARTUP_PERIOD may be overridden (as is reasonable, and perhaps required by TestNDFS) then perhaps they should be removed from FSConstants entirely.  Does that make sense?

In Server.java, why is notifyAll() safer than notify()?  The intent is to wake one and only one waiting Handler thread.  notifyAll() would cause all of the Handler threads to become runnable even when only a single call has arrived.  Is this required by TestNDFS?

Thanks,

Doug


[jira] Updated: (NUTCH-116) TestNDFS a JUnit test specifically for NDFS

Posted by "Paul Baclace (JIRA)" <ji...@apache.org>.

     [ http://issues.apache.org/jira/browse/NUTCH-116?page=all ]

Paul Baclace updated NUTCH-116:
-------------------------------

    Attachment: required_by_TestNDFS.patch
                TestNDFS.java

Patch comments:  

src/java/org/apache/nutch/ipc/Server.java
  improved logging details, use the safer notifyAll() instead of notify(), added comments.

src/java/org/apache/nutch/ndfs/DataNode.java
  improved logging details, added comments, improved error message, 
  factored reusable code into makeInstanceForDir(), added toString(),
  added properties ndfs.blockreport.intervalMsec and ndfs.datanode.startupMsec
  to allow the override of BLOCKREPORT_INTERVAL and DATANODE_STARTUP_PERIOD,
  respectively.

src/java/org/apache/nutch/ndfs/FSDataset.java
  added toString() methods.

src/java/org/apache/nutch/ndfs/FSNamesystem.java
  Changed chooseTarget() to behave as commented rather than as implemented (it
  says it forbids picking a target on the same host, but it was using
  host:port as the basis of comparison). TestNDFS needs host:port comparison,
  and normal operation just uses 'host' to better distribute replicas;
  simplified redundant conditional,
  added property test.ndfs.same.host.targets.allowed which allows target datanode
  selection to use the same host (but not the same port), added logging for previously
  silent errors, emitted more info for some logging, changed LOG.info() to
  LOG.warning(), added javadoc comments.

src/java/org/apache/nutch/ndfs/NameNode.java
  Added a way to stop the daemon for JUnit testing, added javadoc comments.



[jira] Commented: (NUTCH-116) TestNDFS a JUnit test specifically for NDFS

Posted by "Paul Baclace (JIRA)" <ji...@apache.org>.

    [ http://issues.apache.org/jira/browse/NUTCH-116?page=comments#action_12332546 ] 

Paul Baclace commented on NUTCH-116:
------------------------------------

Doug,

Thanks for the quick response.  

1. Should BLOCKREPORT_INTERVAL and DATANODE_STARTUP_PERIOD  be removed from FSConstants entirely?

I see two choices here:  (1) have these constants remain as defaults used when override properties are not used (the normal case for these settings) or (2) remove them and have the defaults be literal constants in statements scattered around.  My preference is (1) because (2) allows for multiple, different literal constant defaults when a setting is retrieved in more than one code block.

2.A. Regarding Server.java and switching all notify() to notifyAll():  I'm glad you spotted that.  I was debugging a liveness problem and should have circled back and justified the dogmatic use of notifyAll().  Looking at it again, two of them are waiting on a private object, and that is a perfect candidate for notify().  The wait and notify between join() and stop() should use notifyAll() because the wait is on a non-private object.  I revised this and tightened the join() to be a proper while (running) { wait(); } loop, which avoids the hazard of "spurious wakeup" in POSIX threads (as noted in Effective Java by Joshua Bloch).

2.B. I backed out a change that had removed ' synchronized (FSNamesystem.this) ' from
LeaseMonitor on the assumption that it was unnecessary.  Looking at it again, I am not sure why
it is necessary to synchronize on the enclosing scope FSNamesystem.this, but it would
ensure that many file operations could not begin until the lease renewals were finished.
It seems that method synchronization was used throughout for conservative safety.

3. "Changes to comments, documentation, logging, etc. are better contributed as separate patches".  I separated out changes not strictly required for TestNDFS in an earlier patch, but I did not attempt to do this at a granularity below the file level.  When these human-understanding changes co-occur with ones that are functionally necessary, it is unwieldy to separate out the edits and test them separately.  Additionally, some logging changes, like giving numbers to Server Handler thread names, make it much easier to determine when a daemon is properly shut down.

Paul



[jira] Updated: (NUTCH-116) TestNDFS a JUnit test specifically for NDFS

Posted by "Paul Baclace (JIRA)" <ji...@apache.org>.

     [ http://issues.apache.org/jira/browse/NUTCH-116?page=all ]

Paul Baclace updated NUTCH-116:
-------------------------------

    Attachment: TestNDFS.java

Revised TestNDFS to add a log message about which random number generator is in use (also changed the fixed seed to a newly created one instead of the seed used by a different MersenneTwister with a different seed init, and made minor comment changes).
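
Logging the generator and seed is what makes a failing run reproducible. A minimal sketch of the idea, assuming java.util.Random in place of the MersenneTwister mentioned above; the class name, the seed value, and the helper are hypothetical and not taken from TestNDFS.java.

```java
import java.util.Arrays;
import java.util.Random;

// Hypothetical sketch: test data derived only from a logged seed.
class SeededDataSketch {
    // Fills a buffer with pseudo-random bytes determined entirely by the seed.
    static byte[] randomBytes(long seed, int length) {
        byte[] data = new byte[length];
        new Random(seed).nextBytes(data);
        return data;
    }

    public static void main(String[] args) {
        long seed = 0xDEADBEEFL;   // arbitrary fixed seed chosen for this sketch
        // Report which generator and seed are in use so a run can be repeated.
        System.out.println("random generator: " + Random.class.getName() + ", seed: " + seed);
        byte[] written = randomBytes(seed, 16);
        // Verification can regenerate the expected bytes from the same seed
        // instead of keeping a copy of everything that was written.
        byte[] expected = randomBytes(seed, 16);
        System.out.println("data matches: " + Arrays.equals(written, expected));
    }
}
```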

