Posted to dev@nutch.apache.org by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/14 01:00:06 UTC

[jira] [Closed] (NUTCH-478) Add function for stopping FetcherThread gracefully

     [ https://issues.apache.org/jira/browse/NUTCH-478?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Markus Jelsma closed NUTCH-478.
-------------------------------

    Resolution: Won't Fix

The fetcher can be interrupted and resumed in 2.0; making this work in 1.x would be very hard.

> Add function for stopping FetcherThread gracefully
> --------------------------------------------------
>
>                 Key: NUTCH-478
>                 URL: https://issues.apache.org/jira/browse/NUTCH-478
>             Project: Nutch
>          Issue Type: New Feature
>          Components: fetcher
>    Affects Versions: 0.9.0
>            Reporter: chee.wu
>
> Currently the fetch process stops only when a timeout occurs during the fetch:
> "(System.currentTimeMillis() - lastRequestStart.get()) > timeout"
> There is no way to tell the fetch process to stop. Sometimes the fetch has to run within a strict time window, for example from 11pm to 7am. I want to shut down the fetch process at 7am every day even if pages in the generated segments remain unfetched.
> A possible solution to implement this might be:
> 1. The user creates a file named "FetchStop" in the Nutch home directory.
> 2. The main thread checks for the file every minute and, if it exists, sets a boolean variable such as "stopFetch" to true.
> 3. Each FetcherThread checks "stopFetch" before fetching the next URL. If it is true, the FetcherThread stops immediately and activeThreads is decremented.
> 4. Finally, the main thread ends once activeThreads reaches 0.
>  
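
The four steps proposed in the description could be sketched roughly as follows. This is a standalone illustration, not Nutch code: the class GracefulFetchStop, the fetch() placeholder, the in-memory URL queue and the configurable poll interval are all assumptions made for the example. Only the names "FetchStop", "stopFetch" and "activeThreads" come from the issue text.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical sketch of a stop-file check; not the actual Nutch Fetcher. */
public class GracefulFetchStop {
    private final AtomicBoolean stopFetch = new AtomicBoolean(false);
    private final AtomicInteger activeThreads = new AtomicInteger(0);
    final AtomicInteger fetched = new AtomicInteger(0);

    private final Queue<String> urls;   // stand-in for the fetch list of a segment
    private final Path stopFile;        // e.g. "FetchStop" in the Nutch home directory
    private final long pollMillis;      // the issue suggests one minute

    GracefulFetchStop(Queue<String> urls, Path stopFile, long pollMillis) {
        this.urls = urls;
        this.stopFile = stopFile;
        this.pollMillis = pollMillis;
    }

    /** Placeholder for the real fetch; it only sleeps and counts. */
    private void fetch(String url) throws InterruptedException {
        Thread.sleep(20);
        fetched.incrementAndGet();
    }

    /** Step 3: check the flag before taking the next URL. */
    private void workerLoop() {
        try {
            while (!stopFetch.get()) {
                String url = urls.poll();
                if (url == null) {
                    break; // nothing left to fetch
                }
                fetch(url);
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            activeThreads.decrementAndGet();
        }
    }

    /** Steps 2 and 4: poll for the stop file until no worker is active. */
    int run(int threadCount) throws InterruptedException {
        for (int i = 0; i < threadCount; i++) {
            activeThreads.incrementAndGet();
            new Thread(this::workerLoop, "FetcherThread-" + i).start();
        }
        while (activeThreads.get() > 0) {
            if (Files.exists(stopFile)) {
                stopFetch.set(true);
            }
            Thread.sleep(pollMillis);
        }
        return urls.size(); // URLs left unfetched
    }

    public static void main(String[] args) throws Exception {
        Queue<String> urls = new ConcurrentLinkedQueue<>();
        for (int i = 0; i < 200; i++) {
            urls.add("http://example.com/page" + i);
        }
        // Step 1: create a file named FetchStop in the working directory to stop early.
        GracefulFetchStop fetcher = new GracefulFetchStop(urls, Paths.get("FetchStop"), 1000);
        int remaining = fetcher.run(4);
        System.out.println("fetched=" + fetcher.fetched.get() + " remaining=" + remaining);
    }
}
```

In this sketch a worker always finishes the URL it is currently fetching before it sees the flag, so a stop request takes effect after at most one poll interval plus one fetch. The real fetcher in 1.x also has per-host queues and a feeder thread to consider, which is part of why retrofitting this there would be harder than the sketch suggests.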
