- [jira] [Created] (NUTCH-2437) gora mongodb mapping file error - posted by "Tulay Muezzinoglu (JIRA)" <ji...@apache.org> on 2017/10/04 02:42:00 UTC, 0 replies.
- [jira] [Commented] (NUTCH-2437) gora mongodb mapping file error - posted by "ASF GitHub Bot (JIRA)" <ji...@apache.org> on 2017/10/04 03:00:00 UTC, 5 replies.
- [jira] [Commented] (NUTCH-2374) Upgrade Nutch 2.X to Gora 0.7 - posted by "ASF GitHub Bot (JIRA)" <ji...@apache.org> on 2017/10/04 03:24:00 UTC, 4 replies.
- [jira] [Updated] (NUTCH-2374) Upgrade Nutch 2.X to Gora 0.7 - posted by "Lewis John McGibbney (JIRA)" <ji...@apache.org> on 2017/10/04 17:42:00 UTC, 0 replies.
- [jira] [Resolved] (NUTCH-2374) Upgrade Nutch 2.X to Gora 0.7 - posted by "Lewis John McGibbney (JIRA)" <ji...@apache.org> on 2017/10/04 17:42:00 UTC, 0 replies.
- [jira] [Resolved] (NUTCH-2437) gora mongodb mapping file error - posted by "Lewis John McGibbney (JIRA)" <ji...@apache.org> on 2017/10/04 17:44:00 UTC, 0 replies.
- [jira] [Created] (NUTCH-2438) Upgrade Nutch 2.X to Gora 0.8 - posted by "Tulay Muezzinoglu (JIRA)" <ji...@apache.org> on 2017/10/04 19:23:00 UTC, 0 replies.
- [jira] [Commented] (NUTCH-2435) New configuration allowing to choose whether to store 'parse_text' directory or not. - posted by "ASF GitHub Bot (JIRA)" <ji...@apache.org> on 2017/10/05 08:21:00 UTC, 2 replies.
- [jira] [Commented] (NUTCH-2317) Plugin jars don't get added to classpath while running in local - posted by "ASF GitHub Bot (JIRA)" <ji...@apache.org> on 2017/10/05 10:51:00 UTC, 0 replies.
- [jira] [Commented] (NUTCH-2429) Fix Plugin System to allow protocol plugins to bundle their URLStreamHandlers - posted by "ASF GitHub Bot (JIRA)" <ji...@apache.org> on 2017/10/05 10:53:00 UTC, 45 replies.
- [jira] [Commented] (NUTCH-2438) Upgrade Nutch 2.X to Gora 0.8 - posted by "ASF GitHub Bot (JIRA)" <ji...@apache.org> on 2017/10/05 20:51:00 UTC, 4 replies.
- [jira] [Commented] (NUTCH-2424) Mirror git repository to gitlab.com - posted by "Jorge Luis Betancourt Gonzalez (JIRA)" <ji...@apache.org> on 2017/10/09 12:20:00 UTC, 0 replies.
- [Build failure] Connection Reset when downloading closure-compiler-v20130603.jar - posted by Madhawa Kasun Gunasekara <ma...@gmail.com> on 2017/10/09 20:36:15 UTC, 6 replies.
- [jira] [Created] (NUTCH-2439) Upgrade to Apache Tika 1.16 - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2017/10/11 13:35:00 UTC, 0 replies.
- [jira] [Updated] (NUTCH-2439) Upgrade to Apache Tika 1.16 - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2017/10/11 13:37:00 UTC, 3 replies.
- [jira] [Commented] (NUTCH-2439) Upgrade to Apache Tika 1.16 - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2017/10/11 15:53:00 UTC, 2 replies.
- [jira] [Created] (NUTCH-2440) DbResource does not accept crawlid - posted by "Tulay Muezzinoglu (JIRA)" <ji...@apache.org> on 2017/10/12 00:14:00 UTC, 0 replies.
- Re: Not correct usage of key name(segments) in the RestAPI. - posted by Sebastian Nagel <wa...@googlemail.com> on 2017/10/12 09:17:48 UTC, 0 replies.
- [Nutch Wiki] Update of "FrontPage" by ChrisMattmann - posted by Apache Wiki <wi...@apache.org> on 2017/10/12 16:11:36 UTC, 0 replies.
- [Nutch Wiki] Update of "NaiveBayesParseFilter" by ChrisMattmann - posted by Apache Wiki <wi...@apache.org> on 2017/10/12 16:13:51 UTC, 0 replies.
- Styles - posted by kenneth mcfarland <ke...@gmail.com> on 2017/10/14 01:21:19 UTC, 2 replies.
- [jira] [Commented] (NUTCH-2323) ElasticSearch Indexer does not work on Nutch 2.3.1 - posted by "jackson Pollock (JIRA)" <ji...@apache.org> on 2017/10/14 18:19:00 UTC, 0 replies.
- [jira] [Created] (NUTCH-2441) ARG_SEGMENT usage - posted by "Semyon Semyonov (JIRA)" <ji...@apache.org> on 2017/10/16 14:32:00 UTC, 0 replies.
- [jira] [Updated] (NUTCH-2441) ARG_SEGMENT usage - posted by "Semyon Semyonov (JIRA)" <ji...@apache.org> on 2017/10/16 14:35:00 UTC, 0 replies.
- [jira] [Created] (NUTCH-2442) Injector to stop if job fails to avoid loss of CrawlDb - posted by "Sebastian Nagel (JIRA)" <ji...@apache.org> on 2017/10/17 09:02:00 UTC, 0 replies.
- [jira] [Commented] (NUTCH-2442) Injector to stop if job fails to avoid loss of CrawlDb - posted by "Sebastian Nagel (JIRA)" <ji...@apache.org> on 2017/10/17 09:26:00 UTC, 0 replies.
- [jira] [Created] (NUTCH-2443) Extract links from the video tag with the parse-html plugin - posted by "Jorge Luis Betancourt Gonzalez (JIRA)" <ji...@apache.org> on 2017/10/17 11:00:00 UTC, 0 replies.
- [jira] [Commented] (NUTCH-2443) Extract links from the video tag with the parse-html plugin - posted by "ASF GitHub Bot (JIRA)" <ji...@apache.org> on 2017/10/17 15:27:00 UTC, 3 replies.
- [jira] [Updated] (NUTCH-2411) Index-metadata to support indexing multiple values for a field - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2017/10/17 15:56:00 UTC, 2 replies.
- [jira] [Commented] (NUTCH-2411) Index-metadata to support indexing multiple values for a field - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2017/10/18 10:39:00 UTC, 0 replies.
- [jira] [Commented] (NUTCH-1465) Support sitemaps in Nutch - posted by "ASF GitHub Bot (JIRA)" <ji...@apache.org> on 2017/10/18 15:39:00 UTC, 1 reply.
- [jira] [Commented] (NUTCH-2033) parse-tika skips valid documents. - posted by "Luis Lopez (JIRA)" <ji...@apache.org> on 2017/10/19 05:54:00 UTC, 0 replies.
- [jira] [Commented] (NUTCH-2407) Memory leak causing Nutch Server to run out of memory - posted by "Sebastian Nagel (JIRA)" <ji...@apache.org> on 2017/10/19 21:22:01 UTC, 0 replies.
- [jira] [Updated] (NUTCH-2435) New configuration allowing to choose whether to store 'parse_text' directory or not. - posted by "Sebastian Nagel (JIRA)" <ji...@apache.org> on 2017/10/19 21:30:00 UTC, 0 replies.
- [jira] [Assigned] (NUTCH-2435) New configuration allowing to choose whether to store 'parse_text' directory or not. - posted by "Sebastian Nagel (JIRA)" <ji...@apache.org> on 2017/10/19 21:30:00 UTC, 0 replies.
- [jira] [Resolved] (NUTCH-2435) New configuration allowing to choose whether to store 'parse_text' directory or not. - posted by "Sebastian Nagel (JIRA)" <ji...@apache.org> on 2017/10/19 21:30:01 UTC, 0 replies.
- [jira] [Assigned] (NUTCH-1763) Improving comments on the Injector Class - posted by "Sebastian Nagel (JIRA)" <ji...@apache.org> on 2017/10/19 21:37:00 UTC, 0 replies.
- [jira] [Resolved] (NUTCH-1763) Improving comments on the Injector Class - posted by "Sebastian Nagel (JIRA)" <ji...@apache.org> on 2017/10/19 21:37:00 UTC, 0 replies.
- [jira] [Commented] (NUTCH-1763) Improving comments on the Injector Class - posted by "Hudson (JIRA)" <ji...@apache.org> on 2017/10/19 22:10:00 UTC, 0 replies.
- [jira] [Updated] (NUTCH-1932) Automatically remove orphaned pages - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2017/10/20 08:41:00 UTC, 3 replies.
- [jira] [Commented] (NUTCH-1932) Automatically remove orphaned pages - posted by "Sebastian Nagel (JIRA)" <ji...@apache.org> on 2017/10/20 08:52:00 UTC, 9 replies.
- [jira] [Created] (NUTCH-2444) HostDB CSV dumper to emit field header by default - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2017/10/20 09:08:00 UTC, 0 replies.
- [jira] [Updated] (NUTCH-2444) HostDB CSV dumper to emit field header by default - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2017/10/20 09:09:00 UTC, 0 replies.
- [jira] [Commented] (NUTCH-2444) HostDB CSV dumper to emit field header by default - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2017/10/20 09:10:00 UTC, 2 replies.
- [jira] [Created] (NUTCH-2445) Fetcher following outlinks to keep track of already fetched items - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2017/10/20 12:43:00 UTC, 0 replies.
- [jira] [Updated] (NUTCH-2445) Fetcher following outlinks to keep track of already fetched items - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2017/10/20 12:45:00 UTC, 3 replies.
- [jira] [Created] (NUTCH-2446) URLFiltersCheck fix - posted by "kenneth mcfarland (JIRA)" <ji...@apache.org> on 2017/10/23 07:23:00 UTC, 0 replies.
- [jira] [Commented] (NUTCH-2446) URLFiltersCheck fix - posted by "ASF GitHub Bot (JIRA)" <ji...@apache.org> on 2017/10/23 07:25:00 UTC, 5 replies.
- [jira] [Closed] (NUTCH-2446) URLFiltersCheck fix - posted by "kenneth mcfarland (JIRA)" <ji...@apache.org> on 2017/10/23 08:04:00 UTC, 1 reply.
- [jira] [Reopened] (NUTCH-2446) URLFiltersCheck fix - posted by "Sebastian Nagel (JIRA)" <ji...@apache.org> on 2017/10/23 08:09:00 UTC, 0 replies.
- [jira] [Updated] (NUTCH-2446) URLFiltersCheck fix - posted by "Sebastian Nagel (JIRA)" <ji...@apache.org> on 2017/10/23 08:09:00 UTC, 0 replies.
- [jira] [Resolved] (NUTCH-2446) URLFiltersCheck fix - posted by "Sebastian Nagel (JIRA)" <ji...@apache.org> on 2017/10/23 08:11:00 UTC, 0 replies.
- [jira] [Commented] (NUTCH-2445) Fetcher following outlinks to keep track of already fetched items - posted by "Sebastian Nagel (JIRA)" <ji...@apache.org> on 2017/10/23 08:38:00 UTC, 2 replies.
- [jira] [Created] (NUTCH-2447) Work-around SSLProtocolException: handshake alert: unrecognized_name - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2017/10/23 11:35:00 UTC, 0 replies.
- [jira] [Updated] (NUTCH-2447) Work-around SSLProtocolException: handshake alert: unrecognized_name - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2017/10/23 11:36:00 UTC, 3 replies.
- [Nutch Wiki] Update of "NutchTutorial" by SebastianNagel - posted by Apache Wiki <wi...@apache.org> on 2017/10/23 12:09:17 UTC, 0 replies.
- [jira] [Comment Edited] (NUTCH-2447) Work-around SSLProtocolException: handshake alert: unrecognized_name - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2017/10/23 12:37:00 UTC, 0 replies.
- [jira] [Commented] (NUTCH-2447) Work-around SSLProtocolException: handshake alert: unrecognized_name - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2017/10/23 12:38:00 UTC, 0 replies.
- [jira] [Resolved] (NUTCH-2444) HostDB CSV dumper to emit field header by default - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2017/10/23 13:23:00 UTC, 0 replies.
- [jira] [Closed] (NUTCH-2444) HostDB CSV dumper to emit field header by default - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2017/10/23 13:23:00 UTC, 0 replies.
- [jira] [Resolved] (NUTCH-2445) Fetcher following outlinks to keep track of already fetched items - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2017/10/23 14:00:02 UTC, 0 replies.
- [jira] [Closed] (NUTCH-2445) Fetcher following outlinks to keep track of already fetched items - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2017/10/23 14:00:03 UTC, 0 replies.
- [jira] [Created] (NUTCH-2448) Allow Sending an empty http.agent.version - posted by "Yossi Tamari (JIRA)" <ji...@apache.org> on 2017/10/23 14:26:00 UTC, 0 replies.
- [jira] [Commented] (NUTCH-2448) Allow Sending an empty http.agent.version - posted by "ASF GitHub Bot (JIRA)" <ji...@apache.org> on 2017/10/23 16:28:00 UTC, 8 replies.
- Nutch - Apache Mentor Project Proposal - posted by kenneth mcfarland <ke...@gmail.com> on 2017/10/24 05:48:34 UTC, 1 reply.
- [jira] [Created] (NUTCH-2449) Usage of Tika LanguageIdentifier in language-identifier plugin - posted by "Yossi Tamari (JIRA)" <ji...@apache.org> on 2017/10/24 14:30:00 UTC, 0 replies.
- [jira] [Commented] (NUTCH-2449) Usage of Tika LanguageIdentifier in language-identifier plugin - posted by "Yossi Tamari (JIRA)" <ji...@apache.org> on 2017/10/24 14:31:00 UTC, 7 replies.
- [jira] [Commented] (NUTCH-2394) Possible bugs in the source code - posted by "Sebastian Nagel (JIRA)" <ji...@apache.org> on 2017/10/24 19:26:00 UTC, 6 replies.
- [jira] [Updated] (NUTCH-2394) Possible bugs in the source code - posted by "Sebastian Nagel (JIRA)" <ji...@apache.org> on 2017/10/24 19:26:00 UTC, 0 replies.
- [jira] [Updated] (NUTCH-2448) Allow Sending an empty http.agent.version - posted by "Sebastian Nagel (JIRA)" <ji...@apache.org> on 2017/10/24 19:34:00 UTC, 1 reply.
- [jira] [Resolved] (NUTCH-2448) Allow Sending an empty http.agent.version - posted by "Sebastian Nagel (JIRA)" <ji...@apache.org> on 2017/10/24 19:51:00 UTC, 0 replies.
- ParseOutputFormat - posted by kenneth mcfarland <ke...@gmail.com> on 2017/10/24 20:14:30 UTC, 3 replies.
- [jira] [Commented] (NUTCH-2386) BasicURLNormalizer does not encode curly braces - posted by "Sebastian Nagel (JIRA)" <ji...@apache.org> on 2017/10/24 20:24:00 UTC, 0 replies.
- [jira] [Created] (NUTCH-2450) Remove FixMe in ParseOutputFormat - posted by "Kenneth McFarland (JIRA)" <ji...@apache.org> on 2017/10/24 21:55:00 UTC, 0 replies.
- [jira] [Commented] (NUTCH-2450) Remove FixMe in ParseOutputFormat - posted by "ASF GitHub Bot (JIRA)" <ji...@apache.org> on 2017/10/24 21:57:00 UTC, 3 replies.
- [jira] [Commented] (NUTCH-2399) indexer-elastic does not index multi-value fields (only the first value is indexed) - posted by "ASF GitHub Bot (JIRA)" <ji...@apache.org> on 2017/10/25 10:39:00 UTC, 5 replies.
- [jira] [Resolved] (NUTCH-2386) BasicURLNormalizer does not encode curly braces - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2017/10/25 13:02:00 UTC, 0 replies.
- [jira] [Closed] (NUTCH-2386) BasicURLNormalizer does not encode curly braces - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2017/10/25 13:02:00 UTC, 0 replies.
- Build failed in Jenkins: Nutch-trunk #3463 - posted by Apache Jenkins Server <je...@builds.apache.org> on 2017/10/25 13:52:42 UTC, 1 reply.
- [jira] [Resolved] (NUTCH-1932) Automatically remove orphaned pages - posted by "Sebastian Nagel (JIRA)" <ji...@apache.org> on 2017/10/25 14:52:00 UTC, 0 replies.
- [jira] [Resolved] (NUTCH-2394) Possible bugs in the source code - posted by "Sebastian Nagel (JIRA)" <ji...@apache.org> on 2017/10/25 15:02:00 UTC, 0 replies.
- Jenkins build is back to normal : Nutch-trunk #3464 - posted by Apache Jenkins Server <je...@builds.apache.org> on 2017/10/25 17:09:06 UTC, 0 replies.
- [jira] [Created] (NUTCH-2451) MalformedURLExceptions on perfectly looking URLs? - posted by "Hiran Chaudhuri (JIRA)" <ji...@apache.org> on 2017/10/25 20:39:00 UTC, 0 replies.
- [jira] [Created] (NUTCH-2452) Problem retrieving encoded URLs via FTP? - posted by "Hiran Chaudhuri (JIRA)" <ji...@apache.org> on 2017/10/25 20:47:00 UTC, 0 replies.
- [jira] [Created] (NUTCH-2453) FTP protocol seems to have issues running multithreaded - posted by "Hiran Chaudhuri (JIRA)" <ji...@apache.org> on 2017/10/25 20:59:01 UTC, 0 replies.
- [jira] [Updated] (NUTCH-2453) FTP protocol seems to have issues running multithreaded - posted by "Hiran Chaudhuri (JIRA)" <ji...@apache.org> on 2017/10/25 21:00:03 UTC, 2 replies.
- [jira] [Updated] (NUTCH-2452) Problem retrieving encoded URLs via FTP? - posted by "Hiran Chaudhuri (JIRA)" <ji...@apache.org> on 2017/10/25 21:00:05 UTC, 1 reply.
- [jira] [Updated] (NUTCH-2451) MalformedURLExceptions on perfectly looking URLs? - posted by "Hiran Chaudhuri (JIRA)" <ji...@apache.org> on 2017/10/25 21:00:09 UTC, 1 reply.
- [jira] [Commented] (NUTCH-2452) Problem retrieving encoded URLs via FTP? - posted by "Hiran Chaudhuri (JIRA)" <ji...@apache.org> on 2017/10/25 21:50:00 UTC, 0 replies.
- Fwd: Maven configuration - posted by Raffaele Palmieri <ra...@gmail.com> on 2017/10/27 14:02:29 UTC, 2 replies.
- Crawler-Commons 0.9 released - posted by Julien Nioche <li...@gmail.com> on 2017/10/31 10:55:43 UTC, 0 replies.