- [jira] [Commented] (NUTCH-1480) SolrIndexer to write to multiple servers. - posted by "ASF GitHub Bot (JIRA)" <ji...@apache.org> on 2017/09/04 19:28:02 UTC, 16 replies.
- [jira] [Commented] (NUTCH-1129) Any23 Nutch plugin - posted by "ASF GitHub Bot (JIRA)" <ji...@apache.org> on 2017/09/05 03:35:02 UTC, 6 replies.
- [jira] [Commented] (NUTCH-2375) Upgrade the code base from org.apache.hadoop.mapred to org.apache.hadoop.mapreduce - posted by "ASF GitHub Bot (JIRA)" <ji...@apache.org> on 2017/09/05 08:40:00 UTC, 51 replies.
- Nutch crawl for specifice word not for specific url Then get the structure data and store in hbase. - posted by Muhammad UMER <mu...@hotmail.com> on 2017/09/05 10:01:01 UTC, 0 replies.
- [jira] [Created] (NUTCH-2418) NPE in org.apache.hadoop.io.Text from FetcherThread - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2017/09/05 15:42:00 UTC, 0 replies.
- Irma - posted by BlackIce <bl...@gmail.com> on 2017/09/06 00:31:51 UTC, 0 replies.
- How Nutch crawl for specifice word not for specific url Then get the structure data and store in hbase. - posted by Muhammad UMER <mu...@hotmail.com> on 2017/09/06 06:37:59 UTC, 0 replies.
- [jira] [Created] (NUTCH-2419) Domain blacklist URL filter does not respect command-line override for file - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2017/09/06 10:07:00 UTC, 0 replies.
- [jira] [Updated] (NUTCH-2417) Support for variable fetch delay via FreeGenerator - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2017/09/06 10:09:00 UTC, 1 reply.
- [jira] [Commented] (NUTCH-2417) Support for variable fetch delay via FreeGenerator - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2017/09/06 10:14:00 UTC, 0 replies.
- [jira] [Updated] (NUTCH-2419) Domain blacklist URL filter does not respect command-line override for file - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2017/09/06 10:15:00 UTC, 0 replies.
- Request for Review - posted by lewis john mcgibbney <le...@apache.org> on 2017/09/06 20:57:43 UTC, 5 replies.
- [jira] [Commented] (NUTCH-2419) Domain blacklist URL filter does not respect command-line override for file - posted by "Sebastian Nagel (JIRA)" <ji...@apache.org> on 2017/09/10 19:01:05 UTC, 0 replies.
- [jira] [Commented] (NUTCH-2416) Fetcher to log thread ID - posted by "Sebastian Nagel (JIRA)" <ji...@apache.org> on 2017/09/10 20:04:00 UTC, 0 replies.
- [jira] [Created] (NUTCH-2420) Bug in variable generate.max.count and fetcher.server.delay - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2017/09/11 10:31:00 UTC, 0 replies.
- [jira] [Updated] (NUTCH-2420) Bug in variable generate.max.count and fetcher.server.delay - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2017/09/11 10:50:00 UTC, 0 replies.
- [jira] [Commented] (NUTCH-2409) Injector: complete command-line help and counters - posted by "ASF GitHub Bot (JIRA)" <ji...@apache.org> on 2017/09/11 10:54:00 UTC, 1 reply.
- [jira] [Resolved] (NUTCH-2409) Injector: complete command-line help and counters - posted by "Sebastian Nagel (JIRA)" <ji...@apache.org> on 2017/09/11 10:56:00 UTC, 0 replies.
- [jira] [Commented] (NUTCH-2397) Parser to add paragraph line breaks - posted by "ASF GitHub Bot (JIRA)" <ji...@apache.org> on 2017/09/11 10:59:00 UTC, 1 reply.
- [jira] [Resolved] (NUTCH-2397) Parser to add paragraph line breaks - posted by "Sebastian Nagel (JIRA)" <ji...@apache.org> on 2017/09/11 11:00:05 UTC, 0 replies.
- [jira] [Commented] (NUTCH-2387) Nutch should not index document with "noindex" meta - posted by "Sebastian Nagel (JIRA)" <ji...@apache.org> on 2017/09/11 16:57:00 UTC, 0 replies.
- [Nutch Wiki] Update of "ContributorsGroup" by SebastianNagel - posted by Apache Wiki <wi...@apache.org> on 2017/09/12 19:59:45 UTC, 0 replies.
- [Nutch Wiki] Update of "NutchTutorial" by SebastianNagel - posted by Apache Wiki <wi...@apache.org> on 2017/09/18 14:15:03 UTC, 0 replies.
- [jira] [Created] (NUTCH-2421) parse-html to prioritize HTML5 charset definitions - posted by "Laurent Hervaud (JIRA)" <ji...@apache.org> on 2017/09/19 10:54:01 UTC, 0 replies.
- [jira] [Commented] (NUTCH-2421) parse-html to prioritize HTML5 charset definitions - posted by "Sebastian Nagel (JIRA)" <ji...@apache.org> on 2017/09/19 16:00:00 UTC, 0 replies.
- [jira] [Created] (NUTCH-2422) Update information aboute git repository - posted by "Karl Richter (JIRA)" <ji...@apache.org> on 2017/09/19 16:33:00 UTC, 0 replies.
- [jira] [Updated] (NUTCH-2422) Update information about git repository - posted by "Karl Richter (JIRA)" <ji...@apache.org> on 2017/09/19 17:35:00 UTC, 0 replies.
- [jira] [Created] (NUTCH-2423) Update contributor info page - posted by "Karl Richter (JIRA)" <ji...@apache.org> on 2017/09/19 17:36:00 UTC, 0 replies.
- [jira] [Created] (NUTCH-2424) Mirror git repository to gitlab.com - posted by "Karl Richter (JIRA)" <ji...@apache.org> on 2017/09/19 17:42:01 UTC, 0 replies.
- [jira] [Created] (NUTCH-2425) Update GettingNutchRunningWithUbuntu wiki article - posted by "Karl Richter (JIRA)" <ji...@apache.org> on 2017/09/19 18:34:00 UTC, 0 replies.
- [jira] [Updated] (NUTCH-2425) Update GettingNutchRunningWithUbuntu wiki article - posted by "Karl Richter (JIRA)" <ji...@apache.org> on 2017/09/19 19:57:00 UTC, 0 replies.
- [jira] [Created] (NUTCH-2426) Provide reason for job failure in job overview - posted by "Karl Richter (JIRA)" <ji...@apache.org> on 2017/09/19 20:33:01 UTC, 0 replies.
- [jira] [Commented] (NUTCH-2425) Update GettingNutchRunningWithUbuntu wiki article - posted by "Sebastian Nagel (JIRA)" <ji...@apache.org> on 2017/09/20 08:08:01 UTC, 1 reply.
- [jira] [Created] (NUTCH-2427) Remove all the Hadoop wildcard imports. - posted by "Omkar Reddy (JIRA)" <ji...@apache.org> on 2017/09/20 09:07:00 UTC, 0 replies.
- [jira] [Updated] (NUTCH-2427) Remove all the Hadoop wildcard imports. - posted by "Omkar Reddy (JIRA)" <ji...@apache.org> on 2017/09/20 09:13:00 UTC, 0 replies.
- [jira] [Created] (NUTCH-2428) Provide binary release for Nutch - posted by "Karl Richter (JIRA)" <ji...@apache.org> on 2017/09/20 09:30:00 UTC, 0 replies.
- [jira] [Updated] (NUTCH-2428) Provide binary release for Nutch - posted by "Karl Richter (JIRA)" <ji...@apache.org> on 2017/09/20 13:37:00 UTC, 0 replies.
- Subscription request - posted by Raffaele Palmieri <rp...@apache.org> on 2017/09/20 16:28:52 UTC, 1 reply.
- [jira] [Updated] (NUTCH-2424) Mirror git repository to gitlab.com - posted by "Karl Richter (JIRA)" <ji...@apache.org> on 2017/09/20 16:40:00 UTC, 0 replies.
- [jira] [Commented] (NUTCH-2292) Mavenize the build for nutch-core and nutch-plugins - posted by "Karl Richter (JIRA)" <ji...@apache.org> on 2017/09/20 16:59:00 UTC, 0 replies.
- Maven configuration - posted by Raffaele Palmieri <ra...@gmail.com> on 2017/09/21 16:33:43 UTC, 0 replies.
- [jira] [Created] (NUTCH-2429) Fix Plugin System to allow protocol plugins to bundle their URLStreamHandlers - posted by "Hiran Chaudhuri (JIRA)" <ji...@apache.org> on 2017/09/22 10:23:00 UTC, 0 replies.
- [jira] [Commented] (NUTCH-2429) Fix Plugin System to allow protocol plugins to bundle their URLStreamHandlers - posted by "ASF GitHub Bot (JIRA)" <ji...@apache.org> on 2017/09/22 10:25:00 UTC, 4 replies.
- [jira] [Created] (NUTCH-2430) Complete plugin build configuration - posted by "Sebastian Nagel (JIRA)" <ji...@apache.org> on 2017/09/22 10:55:00 UTC, 0 replies.
- [jira] [Commented] (NUTCH-2430) Complete plugin build configuration - posted by "ASF GitHub Bot (JIRA)" <ji...@apache.org> on 2017/09/22 11:02:00 UTC, 5 replies.
- [jira] [Commented] (NUTCH-2235) Classpath discrepancy with protocol-selenium in deploy mode - posted by "Sebastian Nagel (JIRA)" <ji...@apache.org> on 2017/09/22 11:10:00 UTC, 0 replies.
- [jira] [Closed] (NUTCH-2212) Decrease memory consumption by tuning stack size - posted by "Sebastian Nagel (JIRA)" <ji...@apache.org> on 2017/09/22 11:12:00 UTC, 0 replies.
- [jira] [Resolved] (NUTCH-2212) Decrease memory consumption by tuning stack size - posted by "Sebastian Nagel (JIRA)" <ji...@apache.org> on 2017/09/22 11:12:00 UTC, 0 replies.
- [jira] [Updated] (NUTCH-2429) Fix Plugin System to allow protocol plugins to bundle their URLStreamHandlers - posted by "Sebastian Nagel (JIRA)" <ji...@apache.org> on 2017/09/22 11:23:00 UTC, 0 replies.
- [jira] [Resolved] (NUTCH-2235) Classpath discrepancy with protocol-selenium in deploy mode - posted by "Lewis John McGibbney (JIRA)" <ji...@apache.org> on 2017/09/22 18:18:01 UTC, 0 replies.
- [jira] [Created] (NUTCH-2431) Filterchecker to implement Tool-interface - posted by "Jurian Broertjes (JIRA)" <ji...@apache.org> on 2017/09/25 10:08:00 UTC, 0 replies.
- [jira] [Updated] (NUTCH-2431) Filterchecker to implement Tool-interface - posted by "Jurian Broertjes (JIRA)" <ji...@apache.org> on 2017/09/25 10:09:00 UTC, 0 replies.
- [jira] [Created] (NUTCH-2432) Protocol httpclient to disable cookies if http.enable.cookie.header is false - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2017/09/25 10:56:00 UTC, 0 replies.
- [jira] [Updated] (NUTCH-2432) Protocol httpclient to disable cookies if http.enable.cookie.header is false - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2017/09/25 11:01:00 UTC, 0 replies.
- [jira] [Updated] (NUTCH-2418) NPE in org.apache.hadoop.io.Text from FetcherThread - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2017/09/25 13:59:00 UTC, 0 replies.
- [jira] [Resolved] (NUTCH-2430) Complete plugin build configuration - posted by "Sebastian Nagel (JIRA)" <ji...@apache.org> on 2017/09/25 14:58:00 UTC, 0 replies.
- [jira] [Assigned] (NUTCH-2430) Complete plugin build configuration - posted by "Sebastian Nagel (JIRA)" <ji...@apache.org> on 2017/09/25 14:58:00 UTC, 0 replies.
- Build failed in Jenkins: Nutch-trunk #3454 - posted by Apache Jenkins Server <je...@builds.apache.org> on 2017/09/25 15:51:41 UTC, 0 replies.
- Jenkins build is back to normal : Nutch-trunk #3455 - posted by Apache Jenkins Server <je...@builds.apache.org> on 2017/09/25 16:18:02 UTC, 0 replies.
- [jira] [Updated] (NUTCH-2261) ParseSegment job does not pass metadata for content-level redirects - posted by "Sebastian Nagel (JIRA)" <ji...@apache.org> on 2017/09/25 16:27:00 UTC, 1 reply.
- [jira] [Assigned] (NUTCH-2261) ParseSegment job does not pass metadata for content-level redirects - posted by "Sebastian Nagel (JIRA)" <ji...@apache.org> on 2017/09/25 16:27:00 UTC, 0 replies.
- [jira] [Resolved] (NUTCH-2135) Ant Eclipse build does not include protocol-interactiveselenium - posted by "Sebastian Nagel (JIRA)" <ji...@apache.org> on 2017/09/26 08:16:02 UTC, 0 replies.
- [jira] [Updated] (NUTCH-2135) Ant Eclipse build does not include protocol-interactiveselenium - posted by "Sebastian Nagel (JIRA)" <ji...@apache.org> on 2017/09/26 08:17:00 UTC, 0 replies.
- [jira] [Assigned] (NUTCH-2135) Ant Eclipse build does not include protocol-interactiveselenium - posted by "Sebastian Nagel (JIRA)" <ji...@apache.org> on 2017/09/26 08:17:00 UTC, 0 replies.
- [jira] [Commented] (NUTCH-2407) Memory leak causing Nutch Server to run out of memory - posted by "Sebastian Nagel (JIRA)" <ji...@apache.org> on 2017/09/26 13:43:00 UTC, 0 replies.
- [jira] [Created] (NUTCH-2433) Html Parser: keep htmltag where the outlinks are found - posted by "Marcos Bori (JIRA)" <ji...@apache.org> on 2017/09/26 15:10:02 UTC, 0 replies.
- [jira] [Commented] (NUTCH-2433) Html Parser: keep htmltag where the outlinks are found - posted by "Marcos Bori (JIRA)" <ji...@apache.org> on 2017/09/27 06:45:00 UTC, 1 reply.
- [jira] [Created] (NUTCH-2434) Option to reset parameters HTMLMetaTags - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2017/09/27 09:52:00 UTC, 0 replies.
- [jira] [Updated] (NUTCH-2434) Option to reset parameters HTMLMetaTags - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2017/09/27 09:53:00 UTC, 0 replies.
- [jira] [Created] (NUTCH-2435) New configuration allowing to choose whether to store 'parse_text' directory or not. - posted by "Marcos Bori (JIRA)" <ji...@apache.org> on 2017/09/27 11:06:00 UTC, 0 replies.
- [jira] [Commented] (NUTCH-2435) New configuration allowing to choose whether to store 'parse_text' directory or not. - posted by "ASF GitHub Bot (JIRA)" <ji...@apache.org> on 2017/09/27 11:13:00 UTC, 7 replies.
- [jira] [Commented] (NUTCH-2436) Remove empty comment, and redundant semicolon from CommandRunner - posted by "ASF GitHub Bot (JIRA)" <ji...@apache.org> on 2017/09/27 21:48:00 UTC, 5 replies.
- [jira] [Created] (NUTCH-2436) Remove empty comment, and redundant semicolon from CommandRunner - posted by "kenneth mcfarland (JIRA)" <ji...@apache.org> on 2017/09/27 21:48:00 UTC, 0 replies.
- [jira] [Resolved] (NUTCH-2418) NPE in org.apache.hadoop.io.Text from FetcherThread - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2017/09/28 12:20:01 UTC, 0 replies.
- [jira] [Closed] (NUTCH-2418) NPE in org.apache.hadoop.io.Text from FetcherThread - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2017/09/28 12:21:00 UTC, 0 replies.
- [jira] [Commented] (NUTCH-2418) NPE in org.apache.hadoop.io.Text from FetcherThread - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2017/09/28 12:21:00 UTC, 0 replies.
- [jira] [Updated] (NUTCH-2436) Remove empty comment, and redundant semicolon from CommandRunner - posted by "Lewis John McGibbney (JIRA)" <ji...@apache.org> on 2017/09/28 18:54:09 UTC, 0 replies.
- [jira] [Resolved] (NUTCH-2436) Remove empty comment, and redundant semicolon from CommandRunner - posted by "Lewis John McGibbney (JIRA)" <ji...@apache.org> on 2017/09/28 18:55:00 UTC, 0 replies.
- [jira] [Closed] (NUTCH-2436) Remove empty comment, and redundant semicolon from CommandRunner - posted by "kenneth mcfarland (JIRA)" <ji...@apache.org> on 2017/09/28 20:44:00 UTC, 1 reply.
- [jira] [Commented] (NUTCH-2268) SolrIndexerJob: java.lang.RuntimeException - posted by "Ronan (JIRA)" <ji...@apache.org> on 2017/09/29 09:43:00 UTC, 0 replies.
- [jira] [Resolved] (NUTCH-2433) Html Parser: keep htmltag where the outlinks are found - posted by "Sebastian Nagel (JIRA)" <ji...@apache.org> on 2017/09/29 11:50:02 UTC, 0 replies.