Posted to dev@nutch.apache.org by "Lewis John McGibbney (JIRA)" <ji...@apache.org> on 2012/04/25 23:22:18 UTC

[jira] [Updated] (NUTCH-1104) Port issues from trunk NutchGora branch

     [ https://issues.apache.org/jira/browse/NUTCH-1104?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lewis John McGibbney updated NUTCH-1104:
----------------------------------------

    Fix Version/s:     (was: nutchgora)
                   2.1

Set and Classify
                
> Port issues from trunk NutchGora branch
> ---------------------------------------
>
>                 Key: NUTCH-1104
>                 URL: https://issues.apache.org/jira/browse/NUTCH-1104
>             Project: Nutch
>          Issue Type: Task
>    Affects Versions: nutchgora
>            Reporter: Markus Jelsma
>             Fix For: 2.1
>
>
> Umbrella issue for tracking issues that should be ported from 1.x trunk to the NutchGora branch. Please mark ported issues by modifying this description.
> NOT YET PORTED:
> * NUTCH-809 Parse-metatags plugin
> * NUTCH-987 Support HTTP auth for Solr communication
> * NUTCH-1028 Log parser keys
> * NUTCH-1036 Solr jobs should increment counters in Reporter
> * NUTCH-1057 Make fetcher thread time out configurable
> * NUTCH-1067 Configure minimum throughput for fetcher
> * NUTCH-1101 Options to purge db_gone records in updatedb
> * NUTCH-1102 Fetcher, rely on fetcher.parse directive only
> * NUTCH-1105 MaxContentLength option for index-basic
> * NUTCH-940 Static field plugin
> * NUTCH-1094 create comprehensive documentation for Nutch 2.0 trunk
> * NUTCH-1207 ParserChecker to output signature
> * NUTCH-1090 InvertLinks should inform when ignoring internal links
> * NUTCH-1174 Outlinks are not properly normalized
> * NUTCH-1203 ParseSegment to show number of milliseconds per parse
> * NUTCH-1173 DomainStats doesn't count db_not_modified
> * NUTCH-1155 Host/domain limit in generator is generate.max.count+1
> * NUTCH-1061 Migrate MoreIndexingFilter from Apache ORO to java.util.regex
> * NUTCH-1142 Normalization and filtering in WebGraph
> * NUTCH-1153 LinkRank not to log all keys and not to write Hadoop _SUCCESS file
> * NUTCH-1195 Add Solr 4x (trunk) example schema
> * NUTCH-1141 Configurable Fetcher queue depth
> * NUTCH-1214 DomainStats tool should be named for what it's doing
> * NUTCH-1213 Pass additional SolrParams when indexing to Solr
> * NUTCH-1211 URLFilterChecker command line help doesn't inform user of STDIN requirements
> * NUTCH-1231 Upgrade to Tika 1.0
> * NUTCH-1230 MimeType API deprecated and breaks with Tika 1.0
> * NUTCH-1235 Upgrade to new Hadoop 0.20.205.0
> * NUTCH-1184 Fetcher to parse and follow Nth degree outlinks
> PORTED:
> * No issues yet
> NOT GOING TO BE PORTED:
> * No issues yet (when adding one, explain why it should not be ported)

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira