Posted to dev@nutch.apache.org by "Pham Tuan Minh (JIRA)" <ji...@apache.org> on 2010/07/14 20:19:50 UTC

[jira] Commented: (NUTCH-854) Define standard attributes with values and explanation to configuration files in conf directory

    [ https://issues.apache.org/jira/browse/NUTCH-854?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12888469#action_12888469 ] 

Pham Tuan Minh commented on NUTCH-854:
--------------------------------------

My idea is that we define the standard attributes Nutch needs in order to work. If users want to customize how their web site (data source) is crawled, they define their own attributes in nutch-site.xml to override the standard ones.
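
A minimal sketch of such an override, assuming the Hadoop-style property format that nutch-default.xml uses, where each property may carry a <description> element. The value "my-site-crawler" is a made-up placeholder, not a recommended setting.

```xml
<?xml version="1.0"?>
<!-- Hypothetical nutch-site.xml: entries here take precedence over the
     property of the same name in nutch-default.xml. -->
<configuration>
  <property>
    <name>http.agent.name</name>
    <!-- Placeholder value, chosen only for illustration. -->
    <value>my-site-crawler</value>
    <description>Agent name sent in the HTTP User-Agent header.
    Overrides the standard value for this site only.</description>
  </property>
</configuration>
```

Keeping the explanation in a <description> element rather than a comment keeps it attached to the property it documents.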

> Define standard attributes with values and explanation to configuration files in conf directory
> -----------------------------------------------------------------------------------------------
>
>                 Key: NUTCH-854
>                 URL: https://issues.apache.org/jira/browse/NUTCH-854
>             Project: Nutch
>          Issue Type: Improvement
>         Environment: Windows XP SP3, Cygwin, JDK 1.6.20, Ant 1.8.1
>            Reporter: Pham Tuan Minh
>             Fix For: 2.0
>
>
> It would make Nutch easier to use if every configuration file in the conf directory defined standard attributes with values and explanations. For example, nutch-site.xml.template currently contains no attributes and no explanation; we should define them.
> -------------
> <?xml version="1.0"?>
> <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
> <!-- site-specific property overrides in this file. -->
> <configuration>
> <!-- Agent name sent in the HTTP User-Agent header -->
> <property>
> <name>http.agent.name</name>
> <value>nutch-solr-integration</value>
> </property>
> <!-- Maximum number of URLs per host in a single fetchlist -->
> <property>
> <name>generate.max.per.host</name>
> <value>100</value>
> </property>
> <!-- Plug-ins used by this site -->
> <property>
> <name>plugin.includes</name>
> <value>protocol-http|urlfilter-regex|parse-tika|scoring-opic|urlnormalizer-(pass|regex|basic)</value>
> </property>
> </configuration>
> -------------
> Thanks,
