Posted to dev@nutch.apache.org by "Pham Tuan Minh (JIRA)" <ji...@apache.org> on 2010/07/14 20:01:52 UTC

[jira] Created: (NUTCH-854) Define standard attributes with values and explanation to configuration file in conf directory

Define standard attributes with values and explanation to configuration file in conf directory
----------------------------------------------------------------------------------------------

                 Key: NUTCH-854
                 URL: https://issues.apache.org/jira/browse/NUTCH-854
             Project: Nutch
          Issue Type: Improvement
         Environment: Windows XP SP3, Cygwin, JDK 1.6.20, Ant 1.8.1
            Reporter: Pham Tuan Minh
             Fix For: 2.0


It would make Nutch easier to use if every configuration file in the conf directory defined standard attributes with values and explanations. For example, nutch-site.xml.template currently contains no attributes and no explanation; we should define them.

-------------
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<!-- site-specific property overrides in this file. -->

<configuration>

<!-- Agent name sent by the crawler -->
<property>
  <name>http.agent.name</name>
  <value>nutch-solr-integration</value>
</property>

<!-- Maximum number of URLs per host in a single fetch list -->
<property>
  <name>generate.max.per.host</name>
  <value>100</value>
</property>

<!-- Plugins enabled for this site -->
<property>
  <name>plugin.includes</name>
  <value>protocol-http|urlfilter-regex|parse-tika|scoring-opic|urlnormalizer-(pass|regex|basic)</value>
</property>
</configuration>
-------------

Thanks,

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (NUTCH-854) Define standard attributes with values and explanation to configuration files in conf directory

Posted by "Pham Tuan Minh (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/NUTCH-854?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12888469#action_12888469 ] 

Pham Tuan Minh commented on NUTCH-854:
--------------------------------------

My idea is that we define the standard attributes Nutch needs in order to work. If users want to customize how their web site (data source) is crawled, they define their own attributes in nutch-site.xml to override the standard ones.
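
As a rough sketch of the override idea above, assuming nutch-site.xml takes precedence over the shipped defaults, a site file would carry only the properties that differ. The agent name below is a hypothetical placeholder, not a recommended value.

```xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<!-- nutch-site.xml: list only the properties this site changes. -->
<configuration>

  <!-- Replaces the standard http.agent.name for this site.
       "my-site-crawler" is a hypothetical placeholder value. -->
  <property>
    <name>http.agent.name</name>
    <value>my-site-crawler</value>
  </property>

</configuration>
```

Any property not listed here would keep its standard value, so the site file stays short and shows at a glance what was customized.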



[jira] Updated: (NUTCH-854) Define standard attributes with values and explanation to configuration files in conf directory

Posted by "Pham Tuan Minh (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/NUTCH-854?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Pham Tuan Minh updated NUTCH-854:
---------------------------------

    Summary: Define standard attributes with values and explanation to configuration files in conf directory  (was: Define standard attributes with values and explanation to configuration file in conf directory)



[jira] Resolved: (NUTCH-854) Define standard attributes with values and explanation to configuration files in conf directory

Posted by "Julien Nioche (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/NUTCH-854?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Julien Nioche resolved NUTCH-854.
---------------------------------

    Resolution: Not A Problem

nutch-default.xml already does that: all the parameters are listed and commented, along with their default values.
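
For illustration only, an entry in nutch-default.xml has roughly the shape sketched below. The value and the description wording are recalled from a Nutch 1.x copy and paraphrased, so treat them as assumptions and check the file shipped with your version.

```xml
<!-- Sketch of a nutch-default.xml style entry. The value and the
     description text are paraphrased assumptions, not copied from
     a specific release. -->
<property>
  <name>generate.max.per.host</name>
  <value>-1</value>
  <description>Maximum number of URLs per host in a single fetch list;
  -1 means unlimited.</description>
</property>
```

The description element is what carries the explanation asked for in this issue, which is why a separate annotated template may not be needed.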
