Posted to dev@nutch.apache.org by "Aaron Nall (JIRA)" <ji...@apache.org> on 2008/07/28 21:33:31 UTC
[jira] Created: (NUTCH-638) Launching Distributed Searchers with
URI indicating filesystem to use rather than relying on hadoop config
files.
Launching Distributed Searchers with URI indicating filesystem to use rather than relying on hadoop config files.
-----------------------------------------------------------------------------------------------------------------
Key: NUTCH-638
URL: https://issues.apache.org/jira/browse/NUTCH-638
Project: Nutch
Issue Type: Improvement
Components: searcher
Affects Versions: 1.0.0
Reporter: Aaron Nall
Priority: Minor
I wanted to run all index creation operations in HDFS but search from the local file system, using the same cluster of machines. I believe this is a common use case.
Until now, this required either a parallel Nutch install or edits (scripted or manual) to hadoop-site.xml to switch the file system from HDFS to local when starting a distributed searcher service. This minor patch makes IndexSearcher and NutchBean honor URIs as supported by Hadoop 0.17, without altering existing behavior when simple paths are entered.
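To illustrate the intended behavior, here is a minimal sketch of the resolution rule using only java.net.URI. It is not the patch itself and does not call the Hadoop API. The class name, the method schemeFor, the host name and the paths are all made up for illustration, and the configuredDefault value stands in for whatever hadoop-site.xml would normally supply.

```java
import java.net.URI;

public class IndexLocationSketch {

    /**
     * Returns the file system scheme that should serve the given index
     * location: the scheme embedded in the location if it is a full URI,
     * otherwise the configured default. The defaultScheme parameter is a
     * hypothetical stand-in for the value read from the Hadoop config.
     */
    static String schemeFor(String location, String defaultScheme) {
        String scheme = URI.create(location).getScheme();
        return scheme != null ? scheme : defaultScheme;
    }

    public static void main(String[] args) {
        // Assumption: the cluster config names HDFS as the default file system.
        String configuredDefault = "hdfs";

        // A full URI overrides the configured default.
        System.out.println(schemeFor("file:///data/nutch/crawl", configuredDefault));              // prints "file"
        System.out.println(schemeFor("hdfs://namenode:9000/user/nutch/crawl", configuredDefault)); // prints "hdfs"

        // A simple path carries no scheme, so the configured default still applies.
        System.out.println(schemeFor("/user/nutch/crawl", configuredDefault));                     // prints "hdfs"
    }
}
```

With a rule like this, the same install can build indexes in HDFS and serve searches from local disk by passing a file: URI when starting the searcher, while existing invocations that pass simple paths keep working as before.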
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Commented: (NUTCH-638) Launching Distributed Searchers with
URI indicating filesystem to use rather than relying on hadoop config
files.
Posted by "Andrzej Bialecki (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/NUTCH-638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12639058#action_12639058 ]
Andrzej Bialecki commented on NUTCH-638:
-----------------------------------------
I think in NutchBean.java we can also use dir.getFileSystem(conf) instead of FileSystem.get(dir.toUri(), this.conf). Could you please test whether this works for you? Other than that, the patch looks fine.
> Launching Distributed Searchers with URI indicating filesystem to use rather than relying on hadoop config files.
> -----------------------------------------------------------------------------------------------------------------
>
> Key: NUTCH-638
> URL: https://issues.apache.org/jira/browse/NUTCH-638
> Project: Nutch
> Issue Type: Improvement
> Components: searcher
> Affects Versions: 1.0.0
> Reporter: Aaron Nall
> Priority: Minor
> Attachments: distributed-search-uri.patch
>
> Original Estimate: 0.25h
> Remaining Estimate: 0.25h
>
[jira] Updated: (NUTCH-638) Launching Distributed Searchers with
URI indicating filesystem to use rather than relying on hadoop config
files.
Posted by "Aaron Nall (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/NUTCH-638?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Aaron Nall updated NUTCH-638:
-----------------------------
Attachment: distributed-search-uri.patch
This is the patch that I used to address the issue.