Posted to dev@nutch.apache.org by "raghavendra prabhu (JIRA)" <ji...@apache.org> on 2005/12/23 17:12:30 UTC

[jira] Commented: (NUTCH-147) nutch map reduce does not work in windows map reduce runs in a loop

    [ http://issues.apache.org/jira/browse/NUTCH-147?page=comments#action_12361198 ] 

raghavendra prabhu commented on NUTCH-147:
------------------------------------------

Is this issue caused by needing Cygwin to run the crawl on Windows?

Version 0.7.1 had no such dependency.

Can anyone confirm?

> nutch map reduce does not work in windows map reduce runs in a loop
> -------------------------------------------------------------------
>
>          Key: NUTCH-147
>          URL: http://issues.apache.org/jira/browse/NUTCH-147
>      Project: Nutch
>         Type: Bug
>   Components: indexer
>     Versions: 0.8-dev
>  Environment: Windows system (WinXP Pro)
>     Reporter: raghavendra prabhu
>     Priority: Blocker

>
> Description
> The crawl starts and I am able to see the initial messages.
> Then the map reduce process starts and continues to run in a loop.
> I do not see the same problem on Linux, where it works perfectly.
> Below is the loop that I run into:
> clustering.OnlineClusterer)
> 051222 182058   Nutch Indexing Filter (org.apache.nutch.indexer.IndexingFilter)
> 051222 182058   Nutch Content Parser (org.apache.nutch.parse.Parser)
> 051222 182058   Ontology Model Loader (org.apache.nutch.ontology.Ontology)
> 051222 182058   Nutch Analysis (org.apache.nutch.analysis.NutchAnalyzer)
> 051222 182058   Nutch Query Filter (org.apache.nutch.searcher.QueryFilter)
> 051222 182058 found resource crawl-urlfilter.txt at file:/G:/trunklatest/conf/crawl-urlfilter.txt
> 051222 182058 crawl\url.txt:0+25
> 051222 182059 crawl\url.txt:0+25
> 051222 182059  map -521216%
> 051222 182100 crawl\url.txt:0+25
> 051222 182100  map -1107496%
> 051222 182101 crawl\url.txt:0+25
> 051222 182101  map -1678544%
> 051222 182102 crawl\url.txt:0+25
> 051222 182102  map -2265900%
> 051222 182103 crawl\url.txt:0+25
> 051222 182103  map -2849416%
> 051222 182104 crawl\url.txt:0+25
> 051222 182104  map -3422908%
> 051222 182105 crawl\url.txt:0+25
> The same output keeps repeating.

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
   http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
   http://www.atlassian.com/software/jira