- Jenkins build is back to normal : Nutch-nutchgora #117 - posted by Apache Jenkins Server <je...@builds.apache.org> on 2012/01/01 05:15:24 UTC, 0 replies.
- Build failed in Jenkins: Nutch-trunk #1711 - posted by Apache Jenkins Server <je...@builds.apache.org> on 2012/01/01 05:16:11 UTC, 0 replies.
- [jira] [Updated] (NUTCH-1210) DomainBlacklistFilter - posted by "Markus Jelsma (Updated) (JIRA)" <ji...@apache.org> on 2012/01/02 11:38:30 UTC, 1 replies.
- [jira] [Commented] (NUTCH-1239) Webgraph should remove deleted pages from segment input - posted by "Markus Jelsma (Commented) (JIRA)" <ji...@apache.org> on 2012/01/02 12:44:30 UTC, 2 replies.
- [jira] [Resolved] (NUTCH-1240) Domain blacklist URL filter - posted by "Markus Jelsma (Resolved) (JIRA)" <ji...@apache.org> on 2012/01/02 12:54:31 UTC, 0 replies.
- [jira] [Resolved] (NUTCH-1212) ParseOutputFormat has redundant code - posted by "Markus Jelsma (Resolved) (JIRA)" <ji...@apache.org> on 2012/01/02 12:58:30 UTC, 0 replies.
- [jira] [Resolved] (NUTCH-1017) Exception getting mime type by name - posted by "Markus Jelsma (Resolved) (JIRA)" <ji...@apache.org> on 2012/01/02 13:00:31 UTC, 0 replies.
- [jira] [Resolved] (NUTCH-1041) Not reading mime-type correctly - posted by "Markus Jelsma (Resolved) (JIRA)" <ji...@apache.org> on 2012/01/02 13:00:32 UTC, 0 replies.
- [jira] [Resolved] (NUTCH-1064) o.a.n.util.MimeUtil uses deprecated Tika methods - posted by "Markus Jelsma (Resolved) (JIRA)" <ji...@apache.org> on 2012/01/02 13:02:30 UTC, 0 replies.
- [jira] [Resolved] (NUTCH-1106) Options to skip url's based on length - posted by "Markus Jelsma (Resolved) (JIRA)" <ji...@apache.org> on 2012/01/02 13:08:30 UTC, 0 replies.
- [jira] [Closed] (NUTCH-1106) Options to skip url's based on length - posted by "Markus Jelsma (Closed) (JIRA)" <ji...@apache.org> on 2012/01/02 13:08:30 UTC, 0 replies.
- [jira] [Commented] (NUTCH-1232) Remove host|site fields from index-basic - posted by "Markus Jelsma (Commented) (JIRA)" <ji...@apache.org> on 2012/01/02 13:12:30 UTC, 0 replies.
- [jira] [Commented] (NUTCH-1138) remove LogUtil from trunk and nutch gora - posted by "Markus Jelsma (Commented) (JIRA)" <ji...@apache.org> on 2012/01/02 13:12:30 UTC, 2 replies.
- [jira] [Resolved] (NUTCH-1239) Webgraph should remove deleted pages from segment input - posted by "Markus Jelsma (Resolved) (JIRA)" <ji...@apache.org> on 2012/01/02 14:12:30 UTC, 0 replies.
- [jira] [Updated] (NUTCH-1232) Remove host field from index-basic - posted by "Markus Jelsma (Updated) (JIRA)" <ji...@apache.org> on 2012/01/02 14:14:30 UTC, 0 replies.
- [jira] [Resolved] (NUTCH-1232) Remove host field from index-basic - posted by "Markus Jelsma (Resolved) (JIRA)" <ji...@apache.org> on 2012/01/02 14:18:30 UTC, 0 replies.
- [jira] [Commented] (NUTCH-1232) Remove host field from index-basic - posted by "Hudson (Commented) (JIRA)" <ji...@apache.org> on 2012/01/02 15:06:30 UTC, 1 replies.
- Build failed in Jenkins: Nutch-trunk #1713 - posted by Apache Jenkins Server <je...@builds.apache.org> on 2012/01/03 05:10:52 UTC, 0 replies.
- What to do with items for which is no parser? - posted by Markus Jelsma <ma...@openindex.io> on 2012/01/03 18:18:26 UTC, 2 replies.
- Build failed in Jenkins: Nutch-trunk #1714 - posted by Apache Jenkins Server <je...@builds.apache.org> on 2012/01/04 05:20:00 UTC, 16 replies.
- [jira] [Created] (NUTCH-1241) CrawlDBScanner should also be able to find records - posted by "Markus Jelsma (Created) (JIRA)" <ji...@apache.org> on 2012/01/04 08:59:38 UTC, 0 replies.
- [jira] [Resolved] (NUTCH-1241) CrawlDBScanner should also be able to find records - posted by "Julien Nioche (Resolved) (JIRA)" <ji...@apache.org> on 2012/01/04 10:11:38 UTC, 0 replies.
- [jira] [Commented] (NUTCH-1241) CrawlDBScanner should also be able to find records - posted by "Markus Jelsma (Commented) (JIRA)" <ji...@apache.org> on 2012/01/04 10:39:38 UTC, 4 replies.
- [jira] [Created] (NUTCH-1242) Allow disabling of URL Filters in ParseSegment - posted by "Edward Drapkin (Created) (JIRA)" <ji...@apache.org> on 2012/01/04 23:31:40 UTC, 0 replies.
- [jira] [Updated] (NUTCH-1242) Allow disabling of URL Filters in ParseSegment - posted by "Edward Drapkin (Updated) (JIRA)" <ji...@apache.org> on 2012/01/04 23:33:39 UTC, 6 replies.
- [jira] [Commented] (NUTCH-1220) Upgrade Solr deps - posted by "X Yang (Commented) (JIRA)" <ji...@apache.org> on 2012/01/05 01:08:42 UTC, 0 replies.
- [jira] [Issue Comment Edited] (NUTCH-1220) Upgrade Solr deps - posted by "X Yang (Issue Comment Edited) (JIRA)" <ji...@apache.org> on 2012/01/05 01:14:39 UTC, 0 replies.
- Build failed in Jenkins: Nutch-trunk #1715 - posted by Apache Jenkins Server <je...@builds.apache.org> on 2012/01/05 05:17:17 UTC, 0 replies.
- [jira] [Resolved] (NUTCH-1146) Get rid of _success files in webgraph code - posted by "Julien Nioche (Resolved) (JIRA)" <ji...@apache.org> on 2012/01/05 12:07:39 UTC, 0 replies.
- [jira] [Created] (NUTCH-1243) Junit jar removed from lib - posted by "Julien Nioche (Created) (JIRA)" <ji...@apache.org> on 2012/01/05 12:31:39 UTC, 0 replies.
- [jira] [Updated] (NUTCH-1243) Junit jar removed from lib - posted by "Julien Nioche (Updated) (JIRA)" <ji...@apache.org> on 2012/01/05 12:41:41 UTC, 0 replies.
- [jira] [Commented] (NUTCH-1146) Get rid of _success files in webgraph code - posted by "Hudson (Commented) (JIRA)" <ji...@apache.org> on 2012/01/05 13:03:39 UTC, 1 replies.
- Jenkins build is back to normal : Nutch-trunk #1716 - posted by Apache Jenkins Server <je...@builds.apache.org> on 2012/01/05 13:48:19 UTC, 0 replies.
- [jira] [Commented] (NUTCH-1243) Junit jar removed from lib - posted by "Hudson (Commented) (JIRA)" <ji...@apache.org> on 2012/01/05 13:49:41 UTC, 1 replies.
- [jira] [Commented] (NUTCH-1237) Improve javac arguements for more verbose output - posted by "Lewis John McGibbney (Commented) (JIRA)" <ji...@apache.org> on 2012/01/05 14:11:39 UTC, 8 replies.
- [jira] [Created] (NUTCH-1244) CrawlDBDumper to filter by regex - posted by "Markus Jelsma (Created) (JIRA)" <ji...@apache.org> on 2012/01/05 15:11:39 UTC, 0 replies.
- [jira] [Updated] (NUTCH-1244) CrawlDBDumper to filter by regex - posted by "Markus Jelsma (Updated) (JIRA)" <ji...@apache.org> on 2012/01/05 15:15:39 UTC, 1 replies.
- [jira] [Commented] (NUTCH-1244) CrawlDBDumper to filter by regex - posted by "Julien Nioche (Commented) (JIRA)" <ji...@apache.org> on 2012/01/05 15:47:39 UTC, 6 replies.
- [jira] [Closed] (NUTCH-1237) Improve javac arguements for more verbose output - posted by "Lewis John McGibbney (Closed) (JIRA)" <ji...@apache.org> on 2012/01/05 16:05:39 UTC, 0 replies.
- [jira] [Resolved] (NUTCH-1237) Improve javac arguements for more verbose output - posted by "Lewis John McGibbney (Resolved) (JIRA)" <ji...@apache.org> on 2012/01/05 16:05:39 UTC, 0 replies.
- [jira] [Closed] (NUTCH-1236) Add link to site documentation to download older versions of Nutch. - posted by "Lewis John McGibbney (Closed) (JIRA)" <ji...@apache.org> on 2012/01/05 16:07:39 UTC, 0 replies.
- [jira] [Resolved] (NUTCH-1236) Add link to site documentation to download older versions of Nutch. - posted by "Lewis John McGibbney (Resolved) (JIRA)" <ji...@apache.org> on 2012/01/05 16:07:39 UTC, 0 replies.
- [jira] [Created] (NUTCH-1245) URL gone with 404 after db.fetch.interval.max stays db_unfetched in CrawlDb and is generated over and over again - posted by "Sebastian Nagel (Created) (JIRA)" <ji...@apache.org> on 2012/01/05 17:23:40 UTC, 0 replies.
- [jira] [Commented] (NUTCH-1245) URL gone with 404 after db.fetch.interval.max stays db_unfetched in CrawlDb and is generated over and over again - posted by "Markus Jelsma (Commented) (JIRA)" <ji...@apache.org> on 2012/01/05 17:49:39 UTC, 2 replies.
- [jira] [Commented] (NUTCH-827) HTTP POST Authentication - posted by "Ian Piper (Commented) (JIRA)" <ji...@apache.org> on 2012/01/05 18:11:39 UTC, 0 replies.
- [jira] [Commented] (NUTCH-926) Nutch follows wrong url in - posted by "Lewis John McGibbney (Commented) (JIRA)" <ji...@apache.org> on 2012/01/05 18:21:40 UTC, 0 replies.
- [jira] [Commented] (NUTCH-874) Make sure all plugins in src/plugin are compatible with Nutch 2.0 and Gora - posted by "Lewis John McGibbney (Commented) (JIRA)" <ji...@apache.org> on 2012/01/05 18:26:39 UTC, 0 replies.
- [jira] [Updated] (NUTCH-827) HTTP POST Authentication - posted by "Markus Jelsma (Updated) (JIRA)" <ji...@apache.org> on 2012/01/06 11:29:39 UTC, 0 replies.
- [jira] [Updated] (NUTCH-1245) URL gone with 404 after db.fetch.interval.max stays db_unfetched in CrawlDb and is generated over and over again - posted by "Markus Jelsma (Updated) (JIRA)" <ji...@apache.org> on 2012/01/06 11:29:39 UTC, 1 replies.
- Build failed in Jenkins: Nutch-nutchgora #124 - posted by Apache Jenkins Server <je...@builds.apache.org> on 2012/01/08 05:16:14 UTC, 0 replies.
- Jenkins build is back to normal : Nutch-nutchgora #125 - posted by Apache Jenkins Server <je...@builds.apache.org> on 2012/01/09 05:11:25 UTC, 0 replies.
- [jira] [Updated] (NUTCH-840) Port tests from parse-html to parse-tika - posted by "Lewis John McGibbney (Updated) (JIRA)" <ji...@apache.org> on 2012/01/09 15:52:42 UTC, 0 replies.
- [jira] [Updated] (NUTCH-1139) Indexer to delete documents - posted by "Markus Jelsma (Updated) (JIRA)" <ji...@apache.org> on 2012/01/09 16:50:40 UTC, 0 replies.
- [jira] [Resolved] (NUTCH-1244) CrawlDBDumper to filter by regex - posted by "Markus Jelsma (Resolved) (JIRA)" <ji...@apache.org> on 2012/01/09 17:02:40 UTC, 0 replies.
- edit wiki? - posted by Markus Jelsma <ma...@openindex.io> on 2012/01/09 17:04:54 UTC, 2 replies.
- [Nutch Wiki] Trivial Update of "AdminGroup" by LewisJohnMcgibbney - posted by Apache Wiki <wi...@apache.org> on 2012/01/09 17:10:20 UTC, 2 replies.
- [Nutch Wiki] Trivial Update of "bin/nutch_readdb" by MarkusJelsma - posted by Apache Wiki <wi...@apache.org> on 2012/01/09 17:14:15 UTC, 1 replies.
- [Nutch Wiki] Update of "bin/nutch_readdb" by MarkusJelsma - posted by Apache Wiki <wi...@apache.org> on 2012/01/09 17:15:40 UTC, 0 replies.
- [Nutch Wiki] Trivial Update of "bin/nutch solrindex" by MarkusJelsma - posted by Apache Wiki <wi...@apache.org> on 2012/01/09 17:19:01 UTC, 1 replies.
- [jira] [Updated] (NUTCH-1138) remove LogUtil from trunk and nutch gora - posted by "Lewis John McGibbney (Updated) (JIRA)" <ji...@apache.org> on 2012/01/09 23:01:40 UTC, 0 replies.
- Build failed in Jenkins: Nutch-nutchgora #126 - posted by Apache Jenkins Server <je...@builds.apache.org> on 2012/01/10 06:25:59 UTC, 1 replies.
- Jenkins build is back to normal : Nutch-trunk #1721 - posted by Apache Jenkins Server <je...@builds.apache.org> on 2012/01/10 06:37:05 UTC, 0 replies.
- [jira] [Commented] (NUTCH-1139) Indexer to delete documents - posted by "Markus Jelsma (Commented) (JIRA)" <ji...@apache.org> on 2012/01/10 11:08:38 UTC, 2 replies.
- [jira] [Resolved] (NUTCH-1139) Indexer to delete documents - posted by "Markus Jelsma (Resolved) (JIRA)" <ji...@apache.org> on 2012/01/10 14:58:39 UTC, 0 replies.
- Build failed in Jenkins: Nutch-nutchgora #127 - posted by Apache Jenkins Server <je...@builds.apache.org> on 2012/01/11 05:15:58 UTC, 0 replies.
- Build failed in Jenkins: Nutch-nutchgora #128 - posted by Apache Jenkins Server <je...@builds.apache.org> on 2012/01/11 18:34:14 UTC, 0 replies.
- [jira] [Resolved] (NUTCH-1138) remove LogUtil from trunk and nutch gora - posted by "Lewis John McGibbney (Resolved) (JIRA)" <ji...@apache.org> on 2012/01/11 18:35:39 UTC, 0 replies.
- [jira] [Closed] (NUTCH-1138) remove LogUtil from trunk and nutch gora - posted by "Lewis John McGibbney (Closed) (JIRA)" <ji...@apache.org> on 2012/01/11 18:37:39 UTC, 0 replies.
- Jenkins build is back to normal : Nutch-nutchgora #129 - posted by Apache Jenkins Server <je...@builds.apache.org> on 2012/01/11 19:20:59 UTC, 0 replies.
- [jira] [Updated] (NUTCH-965) Skip parsing for truncated documents - posted by "Lewis John McGibbney (Updated) (JIRA)" <ji...@apache.org> on 2012/01/11 20:59:39 UTC, 0 replies.
- [jira] [Updated] (NUTCH-1189) add commented out default settings to gora.properties files - posted by "Lewis John McGibbney (Updated) (JIRA)" <ji...@apache.org> on 2012/01/11 21:31:40 UTC, 0 replies.
- [jira] [Closed] (NUTCH-1189) add commented out default settings to gora.properties files - posted by "Lewis John McGibbney (Closed) (JIRA)" <ji...@apache.org> on 2012/01/11 21:33:40 UTC, 0 replies.
- [jira] [Resolved] (NUTCH-1189) add commented out default settings to gora.properties files - posted by "Lewis John McGibbney (Resolved) (JIRA)" <ji...@apache.org> on 2012/01/11 21:33:40 UTC, 0 replies.
- [jira] [Commented] (NUTCH-809) Parse-metatags plugin - posted by "Dean Del Ponte (Commented) (JIRA)" <ji...@apache.org> on 2012/01/11 21:51:45 UTC, 2 replies.
- [jira] [Commented] (NUTCH-1189) add commented out default settings to gora.properties files - posted by "Hudson (Commented) (JIRA)" <ji...@apache.org> on 2012/01/12 05:19:40 UTC, 0 replies.
- [jira] [Updated] (NUTCH-809) Parse-metatags plugin - posted by "Elisabeth Adler (Updated) (JIRA)" <ji...@apache.org> on 2012/01/12 10:15:39 UTC, 0 replies.
- [jira] [Created] (NUTCH-1246) Upgrade to Hadoop 1.0.0 - posted by "Julien Nioche (Created) (JIRA)" <ji...@apache.org> on 2012/01/12 16:41:42 UTC, 0 replies.
- [jira] [Commented] (NUTCH-1246) Upgrade to Hadoop 1.0.0 - posted by "Julien Nioche (Commented) (JIRA)" <ji...@apache.org> on 2012/01/12 17:11:40 UTC, 3 replies.
- [jira] [Created] (NUTCH-1247) CrawlDatum.retries should be int - posted by "Markus Jelsma (Created) (JIRA)" <ji...@apache.org> on 2012/01/12 19:41:40 UTC, 0 replies.
- [jira] [Commented] (NUTCH-1247) CrawlDatum.retries should be int - posted by "Markus Jelsma (Commented) (JIRA)" <ji...@apache.org> on 2012/01/12 19:57:39 UTC, 9 replies.
- [jira] [Commented] (NUTCH-797) parse-tika is not properly constructing URLs when the target begins with a "?" - posted by "Lewis John McGibbney (Commented) (JIRA)" <ji...@apache.org> on 2012/01/12 22:45:43 UTC, 0 replies.
- [jira] [Commented] (NUTCH-1031) Delegate parsing of robots.txt to crawler-commons - posted by "Lewis John McGibbney (Commented) (JIRA)" <ji...@apache.org> on 2012/01/12 23:07:39 UTC, 0 replies.
- [jira] [Created] (NUTCH-1248) Generator to select on status - posted by "Markus Jelsma (Created) (JIRA)" <ji...@apache.org> on 2012/01/13 12:12:39 UTC, 0 replies.
- [jira] [Commented] (NUTCH-1177) Generator to select on retry interval - posted by "Markus Jelsma (Commented) (JIRA)" <ji...@apache.org> on 2012/01/13 13:14:40 UTC, 2 replies.
- [jira] [Resolved] (NUTCH-1177) Generator to select on retry interval - posted by "Markus Jelsma (Resolved) (JIRA)" <ji...@apache.org> on 2012/01/13 15:31:39 UTC, 0 replies.
- [jira] [Updated] (NUTCH-1248) Generator to select on status - posted by "Markus Jelsma (Updated) (JIRA)" <ji...@apache.org> on 2012/01/13 15:35:39 UTC, 0 replies.
- [jira] [Resolved] (NUTCH-1248) Generator to select on status - posted by "Markus Jelsma (Resolved) (JIRA)" <ji...@apache.org> on 2012/01/13 17:44:43 UTC, 0 replies.
- [jira] [Commented] (NUTCH-1248) Generator to select on status - posted by "Hudson (Commented) (JIRA)" <ji...@apache.org> on 2012/01/13 18:18:39 UTC, 1 replies.
- [Nutch Wiki] Trivial Update of "PluginCentral" by ElisabethAdler - posted by Apache Wiki <wi...@apache.org> on 2012/01/14 15:17:27 UTC, 0 replies.
- [Nutch Wiki] Update of "IndexMetatags" by ElisabethAdler - posted by Apache Wiki <wi...@apache.org> on 2012/01/14 16:10:46 UTC, 0 replies.
- Build failed in Jenkins: Nutch-trunk #1726 - posted by Apache Jenkins Server <je...@builds.apache.org> on 2012/01/14 16:51:50 UTC, 0 replies.
- [jira] [Commented] (NUTCH-1176) Fix all javadoc warnings from nightly builds - posted by "Hudson (Commented) (JIRA)" <ji...@apache.org> on 2012/01/14 16:53:39 UTC, 3 replies.
- Build failed in Jenkins: nutch-trunk-maven #108 - posted by Apache Jenkins Server <je...@builds.apache.org> on 2012/01/14 17:34:36 UTC, 0 replies.
- Jenkins build is back to normal : Nutch-trunk #1727 - posted by Apache Jenkins Server <je...@builds.apache.org> on 2012/01/14 17:50:42 UTC, 0 replies.
- Jenkins build is back to normal : nutch-trunk-maven #109 - posted by Apache Jenkins Server <je...@builds.apache.org> on 2012/01/14 18:26:34 UTC, 0 replies.
- [jira] [Created] (NUTCH-1249) Resolve all issues flagged up by adding javac -Xlint arguement - posted by "Lewis John McGibbney (Created) (JIRA)" <ji...@apache.org> on 2012/01/15 17:00:40 UTC, 0 replies.
- Jenkins build is back to normal : Nutch-trunk #1730 - posted by Apache Jenkins Server <je...@builds.apache.org> on 2012/01/16 05:30:51 UTC, 0 replies.
- [jira] [Updated] (NUTCH-1247) CrawlDatum.retries should be int - posted by "Sebastian Nagel (Updated) (JIRA)" <ji...@apache.org> on 2012/01/17 15:31:40 UTC, 0 replies.
- [jira] [Created] (NUTCH-1250) parse-html does not parse links with empty anchor - posted by "Andreas Janning (Created) (JIRA)" <ji...@apache.org> on 2012/01/17 17:27:41 UTC, 0 replies.
- [jira] [Assigned] (NUTCH-1242) Allow disabling of URL Filters in ParseSegment - posted by "Markus Jelsma (Assigned) (JIRA)" <ji...@apache.org> on 2012/01/17 19:06:40 UTC, 0 replies.
- [jira] [Commented] (NUTCH-1242) Allow disabling of URL Filters in ParseSegment - posted by "Markus Jelsma (Commented) (JIRA)" <ji...@apache.org> on 2012/01/17 19:10:40 UTC, 4 replies.
- [jira] [Commented] (NUTCH-1201) Allow for different FetcherThread impls - posted by "Edward Drapkin (Commented) (JIRA)" <ji...@apache.org> on 2012/01/17 19:50:39 UTC, 6 replies.
- I want to volunteer some time - posted by Eddie Drapkin <ed...@wolfram.com> on 2012/01/17 20:07:06 UTC, 5 replies.
- [jira] [Created] (NUTCH-1251) Deletion of duplicates fails with org.apache.solr.client.solrj.SolrServerException - posted by "Arkadi Kosmynin (Created) (JIRA)" <ji...@apache.org> on 2012/01/17 23:41:39 UTC, 0 replies.
- [jira] [Updated] (NUTCH-1251) Deletion of duplicates fails with org.apache.solr.client.solrj.SolrServerException - posted by "Arkadi Kosmynin (Updated) (JIRA)" <ji...@apache.org> on 2012/01/17 23:45:40 UTC, 1 replies.
- [jira] [Commented] (NUTCH-1251) Deletion of duplicates fails with org.apache.solr.client.solrj.SolrServerException - posted by "Markus Jelsma (Commented) (JIRA)" <ji...@apache.org> on 2012/01/17 23:53:40 UTC, 1 replies.
- [jira] [Created] (NUTCH-1252) SegmentReader -get shows wrong data - posted by "Sebastian Nagel (Created) (JIRA)" <ji...@apache.org> on 2012/01/18 09:59:40 UTC, 0 replies.
- [jira] [Updated] (NUTCH-1252) SegmentReader -get shows wrong data - posted by "Sebastian Nagel (Updated) (JIRA)" <ji...@apache.org> on 2012/01/18 09:59:41 UTC, 3 replies.
- [jira] [Created] (NUTCH-1253) Incompatible neko and xerces versions - posted by "Dennis Spathis (Created) (JIRA)" <ji...@apache.org> on 2012/01/18 15:26:40 UTC, 0 replies.
- [jira] [Created] (NUTCH-1254) NTLMv2 is not supported and HttpClient returns error code 500 - posted by "Remi Tassing (Created) (JIRA)" <ji...@apache.org> on 2012/01/18 15:46:39 UTC, 0 replies.
- [jira] [Commented] (NUTCH-1254) NTLMv2 is not supported and HttpClient returns error code 500 - posted by "Lewis John McGibbney (Commented) (JIRA)" <ji...@apache.org> on 2012/01/18 16:06:39 UTC, 2 replies.
- [jira] [Closed] (NUTCH-1254) NTLMv2 is not supported and HttpClient returns error code 500 - posted by "Lewis John McGibbney (Closed) (JIRA)" <ji...@apache.org> on 2012/01/18 16:34:39 UTC, 0 replies.
- [jira] [Commented] (NUTCH-1086) Rewrite protocol-httpclient - posted by "Lewis John McGibbney (Commented) (JIRA)" <ji...@apache.org> on 2012/01/18 16:36:40 UTC, 5 replies.
- Get target URL of redirects - posted by Markus Jelsma <ma...@openindex.io> on 2012/01/19 17:25:13 UTC, 2 replies.
- make nutch plugin to get termfreqvectors - posted by Ale <at...@yahoo.com> on 2012/01/20 00:36:53 UTC, 2 replies.
- [jira] [Updated] (NUTCH-1205) Upgrade gora modules to 0.2-incubating in ivy/ivy.xml - posted by "Ferdy Galema (Updated) (JIRA)" <ji...@apache.org> on 2012/01/20 11:03:40 UTC, 2 replies.
- minor suggestion to ivy.xml of plugins (remove nutch.root property) - posted by Ferdy Galema <fe...@kalooga.com> on 2012/01/20 12:21:44 UTC, 1 replies.
- [jira] [Created] (NUTCH-1255) Change ivy.xml of all plugins to remove "nutch.root" property - posted by "Ferdy Galema (Created) (JIRA)" <ji...@apache.org> on 2012/01/20 15:44:39 UTC, 0 replies.
- [jira] [Commented] (NUTCH-1205) Upgrade gora modules to 0.2-incubating in ivy/ivy.xml - posted by "Ferdy Galema (Commented) (JIRA)" <ji...@apache.org> on 2012/01/20 16:41:41 UTC, 0 replies.
- [DISCUSS] Issues with Fetcher - posted by Lewis John Mcgibbney <le...@gmail.com> on 2012/01/20 18:16:07 UTC, 8 replies.
- [jira] [Commented] (NUTCH-965) Skip parsing for truncated documents - posted by "Lewis John McGibbney (Commented) (JIRA)" <ji...@apache.org> on 2012/01/21 23:35:40 UTC, 0 replies.
- [jira] [Updated] (NUTCH-1255) Change ivy.xml of all plugins to remove "nutch.root" property - posted by "Ferdy Galema (Updated) (JIRA)" <ji...@apache.org> on 2012/01/23 15:02:40 UTC, 0 replies.
- [jira] [Closed] (NUTCH-1255) Change ivy.xml of all plugins to remove "nutch.root" property - posted by "Ferdy Galema (Closed) (JIRA)" <ji...@apache.org> on 2012/01/23 15:04:41 UTC, 0 replies.
- [jira] [Commented] (NUTCH-1255) Change ivy.xml of all plugins to remove "nutch.root" property - posted by "Hudson (Commented) (JIRA)" <ji...@apache.org> on 2012/01/24 05:15:39 UTC, 1 replies.
- [jira] [Updated] (NUTCH-1201) Allow for different FetcherThread impls - posted by "Markus Jelsma (Updated) (JIRA)" <ji...@apache.org> on 2012/01/24 17:52:41 UTC, 0 replies.
- [jira] [Created] (NUTCH-1256) WebGraph to dump host + score - posted by "Markus Jelsma (Created) (JIRA)" <ji...@apache.org> on 2012/01/24 18:26:42 UTC, 0 replies.
- [jira] [Created] (NUTCH-1257) Support for the x-robots-tag HTTP Header - posted by "Mike (Created) (JIRA)" <ji...@apache.org> on 2012/01/25 08:59:38 UTC, 0 replies.
- [jira] [Created] (NUTCH-1258) MoreIndexingFilter should be able to read Content-Type from both parse metadata and content metadata - posted by "Markus Jelsma (Created) (JIRA)" <ji...@apache.org> on 2012/01/25 12:28:40 UTC, 0 replies.
- [jira] [Updated] (NUTCH-1258) MoreIndexingFilter should be able to read Content-Type from both parse metadata and content metadata - posted by "Markus Jelsma (Updated) (JIRA)" <ji...@apache.org> on 2012/01/25 12:46:40 UTC, 0 replies.
- [jira] [Commented] (NUTCH-1258) MoreIndexingFilter should be able to read Content-Type from both parse metadata and content metadata - posted by "Markus Jelsma (Commented) (JIRA)" <ji...@apache.org> on 2012/01/25 13:00:42 UTC, 3 replies.
- [jira] [Created] (NUTCH-1259) TikaParser should not add Content-Type from HTTP Headers to Nutch Metadata - posted by "Markus Jelsma (Created) (JIRA)" <ji...@apache.org> on 2012/01/25 14:36:40 UTC, 0 replies.
- [jira] [Commented] (NUTCH-1259) TikaParser should not add Content-Type from HTTP Headers to Nutch Metadata - posted by "Markus Jelsma (Commented) (JIRA)" <ji...@apache.org> on 2012/01/25 14:38:40 UTC, 1 replies.
- [jira] [Commented] (NUTCH-1253) Incompatible neko and xerces versions - posted by "Ferdy Galema (Commented) (JIRA)" <ji...@apache.org> on 2012/01/25 14:44:49 UTC, 0 replies.
- [jira] [Updated] (NUTCH-1256) WebGraph to dump host + score - posted by "Markus Jelsma (Updated) (JIRA)" <ji...@apache.org> on 2012/01/25 17:08:40 UTC, 0 replies.
- [jira] [Commented] (NUTCH-1113) Merging segments causes URLs to vanish from crawldb/index? - posted by "Sebastian Nagel (Commented) (JIRA)" <ji...@apache.org> on 2012/01/25 17:42:44 UTC, 0 replies.
- [jira] [Created] (NUTCH-1260) Fetcher should log fetching of redirects - posted by "Sebastian Nagel (Created) (JIRA)" <ji...@apache.org> on 2012/01/27 13:18:52 UTC, 0 replies.
- [jira] [Updated] (NUTCH-1260) Fetcher should log fetching of redirects - posted by "Sebastian Nagel (Updated) (JIRA)" <ji...@apache.org> on 2012/01/27 13:20:43 UTC, 1 replies.
- [jira] [Assigned] (NUTCH-1260) Fetcher should log fetching of redirects - posted by "Markus Jelsma (Assigned) (JIRA)" <ji...@apache.org> on 2012/01/27 14:03:41 UTC, 0 replies.
- [jira] [Resolved] (NUTCH-1260) Fetcher should log fetching of redirects - posted by "Markus Jelsma (Resolved) (JIRA)" <ji...@apache.org> on 2012/01/27 14:15:22 UTC, 0 replies.
- % of different content types out there on the web - posted by "Mattmann, Chris A (388J)" <ch...@jpl.nasa.gov> on 2012/01/28 03:01:26 UTC, 5 replies.
- [jira] [Commented] (NUTCH-1260) Fetcher should log fetching of redirects - posted by "Hudson (Commented) (JIRA)" <ji...@apache.org> on 2012/01/28 05:21:10 UTC, 0 replies.
- Build failed in Jenkins: Nutch-nutchgora #146 - posted by Apache Jenkins Server <je...@builds.apache.org> on 2012/01/29 05:11:35 UTC, 0 replies.
- Build failed in Jenkins: Nutch-trunk #1742 - posted by Apache Jenkins Server <je...@builds.apache.org> on 2012/01/29 05:11:57 UTC, 0 replies.
- Why Nutch is not crawling all links from web page - posted by tahere ganjiyar <ta...@gmail.com> on 2012/01/29 15:03:02 UTC, 0 replies.
- Jenkins build is back to normal : Nutch-nutchgora #147 - posted by Apache Jenkins Server <je...@builds.apache.org> on 2012/01/30 05:14:32 UTC, 0 replies.
- Jenkins build is back to normal : Nutch-trunk #1743 - posted by Apache Jenkins Server <je...@builds.apache.org> on 2012/01/30 05:26:03 UTC, 0 replies.
- [jira] [Created] (NUTCH-1261) Make numReducers configurable for indexer - posted by "Markus Jelsma (Created) (JIRA)" <ji...@apache.org> on 2012/01/30 11:33:10 UTC, 0 replies.
- [jira] [Commented] (NUTCH-1256) WebGraph to dump host + score - posted by "Markus Jelsma (Commented) (JIRA)" <ji...@apache.org> on 2012/01/30 11:33:13 UTC, 3 replies.
- [jira] [Commented] (NUTCH-1261) Make numReducers configurable for indexer - posted by "Julien Nioche (Commented) (JIRA)" <ji...@apache.org> on 2012/01/30 11:51:10 UTC, 0 replies.
- [jira] [Resolved] (NUTCH-1261) Make numReducers configurable for indexer - posted by "Markus Jelsma (Resolved) (JIRA)" <ji...@apache.org> on 2012/01/30 12:29:10 UTC, 0 replies.
- [jira] [Created] (NUTCH-1262) Map `duplicating` content-types to a single type - posted by "Markus Jelsma (Created) (JIRA)" <ji...@apache.org> on 2012/01/31 10:16:09 UTC, 0 replies.
- [jira] [Updated] (NUTCH-1262) Map `duplicating` content-types to a single type - posted by "Markus Jelsma (Updated) (JIRA)" <ji...@apache.org> on 2012/01/31 11:02:10 UTC, 0 replies.
- [jira] [Commented] (NUTCH-1262) Map `duplicating` content-types to a single type - posted by "Julien Nioche (Commented) (JIRA)" <ji...@apache.org> on 2012/01/31 11:30:10 UTC, 1 replies.
- [jira] [Created] (NUTCH-1263) FetcherJob must put 'fetchTime' on input - posted by "Ferdy Galema (Created) (JIRA)" <ji...@apache.org> on 2012/01/31 14:00:12 UTC, 0 replies.
- [jira] [Updated] (NUTCH-1263) FetcherJob must put 'fetchTime' on input - posted by "Ferdy Galema (Updated) (JIRA)" <ji...@apache.org> on 2012/01/31 14:02:10 UTC, 0 replies.
- [jira] [Resolved] (NUTCH-1256) WebGraph to dump host + score - posted by "Markus Jelsma (Resolved) (JIRA)" <ji...@apache.org> on 2012/01/31 15:20:10 UTC, 0 replies.
- [jira] [Resolved] (NUTCH-1242) Allow disabling of URL Filters in ParseSegment - posted by "Markus Jelsma (Resolved) (JIRA)" <ji...@apache.org> on 2012/01/31 16:27:10 UTC, 0 replies.
- [jira] [Commented] (NUTCH-1081) ant tests fail - posted by "Lewis John McGibbney (Commented) (JIRA)" <ji...@apache.org> on 2012/01/31 22:27:59 UTC, 0 replies.