Posted to dev@nutch.apache.org by "raghavendra prabhu (JIRA)" <ji...@apache.org> on 2005/12/22 04:45:30 UTC
[jira] Created: (NUTCH-147) nutch map reduce does not work in windows map reduce runs in a loop
nutch map reduce does not work in windows map reduce runs in a loop
-------------------------------------------------------------------
Key: NUTCH-147
URL: http://issues.apache.org/jira/browse/NUTCH-147
Project: Nutch
Type: Bug
Components: indexer
Versions: 0.8-dev
Environment: Windows XP Pro
Reporter: raghavendra prabhu
Priority: Blocker
Description
The crawl starts and I am able to see the initial messages.
Then the map reduce process starts and keeps running in a loop.
I do not see the same problem on Linux, where it works perfectly.
Below is the loop I run into:
clustering.OnlineClusterer)
051222 182058 Nutch Indexing Filter (org.apache.nutch.indexer.IndexingFilter)
051222 182058 Nutch Content Parser (org.apache.nutch.parse.Parser)
051222 182058 Ontology Model Loader (org.apache.nutch.ontology.Ontology)
051222 182058 Nutch Analysis (org.apache.nutch.analysis.NutchAnalyzer)
051222 182058 Nutch Query Filter (org.apache.nutch.searcher.QueryFilter)
051222 182058 found resource crawl-urlfilter.txt at file:/G:/trunklatest/conf/crawl-urlfilter.txt
051222 182058 crawl\url.txt:0+25
051222 182059 crawl\url.txt:0+25
051222 182059 map -521216%
051222 182100 crawl\url.txt:0+25
051222 182100 map -1107496%
051222 182101 crawl\url.txt:0+25
051222 182101 map -1678544%
051222 182102 crawl\url.txt:0+25
051222 182102 map -2265900%
051222 182103 crawl\url.txt:0+25
051222 182103 map -2849416%
051222 182104 crawl\url.txt:0+25
051222 182104 map -3422908%
051222 182105 crawl\url.txt:0+25
The same output keeps repeating.
--
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
http://www.atlassian.com/software/jira
[jira] Closed: (NUTCH-147) nutch map reduce does not work in windows map reduce runs in a loop
Posted by "Piotr Kosiorowski (JIRA)" <ji...@apache.org>.
[ http://issues.apache.org/jira/browse/NUTCH-147?page=all ]
Piotr Kosiorowski closed NUTCH-147:
-----------------------------------
Resolution: Invalid
The Cygwin requirement on Windows is listed in the Nutch tutorial. Please reopen if the problem persists after running it from a Cygwin environment.
[jira] Commented: (NUTCH-147) nutch map reduce does not work in windows map reduce runs in a loop
Posted by "raghavendra prabhu (JIRA)" <ji...@apache.org>.
[ http://issues.apache.org/jira/browse/NUTCH-147?page=comments#action_12361198 ]
raghavendra prabhu commented on NUTCH-147:
------------------------------------------
Is this issue because you need Cygwin to run the crawl on Windows?
Version 0.7.1 had no such dependency.
Can anyone confirm?