- [jira] [Created] (NUTCH-2751) nutch clean does not work with secured solr cloud - posted by "Daniel Hammling (Jira)" <ji...@apache.org> on 2019/11/01 14:12:00 UTC, 0 replies.
- [jira] [Updated] (NUTCH-2751) nutch clean does not work with secured solr cloud - posted by "Sebastian Nagel (Jira)" <ji...@apache.org> on 2019/11/04 10:33:00 UTC, 0 replies.
- [jira] [Commented] (NUTCH-2751) nutch clean does not work with secured solr cloud - posted by "Sebastian Nagel (Jira)" <ji...@apache.org> on 2019/11/04 10:45:00 UTC, 2 replies.
- [jira] [Comment Edited] (NUTCH-2751) nutch clean does not work with secured solr cloud - posted by "Daniel Hammling (Jira)" <ji...@apache.org> on 2019/11/04 11:45:00 UTC, 0 replies.
- [jira] [Created] (NUTCH-2752) indexer-solr: Upgrade to latest Solr version - posted by "Sebastian Nagel (Jira)" <ji...@apache.org> on 2019/11/04 12:09:00 UTC, 0 replies.
- [jira] [Updated] (NUTCH-2752) indexer-solr: Upgrade to latest Solr version - posted by "Sebastian Nagel (Jira)" <ji...@apache.org> on 2019/11/04 12:11:00 UTC, 0 replies.
- [jira] [Commented] (NUTCH-2747) Replace remaining o.a.commons.logging by org.slf4j - posted by "ASF GitHub Bot (Jira)" <ji...@apache.org> on 2019/11/07 08:02:00 UTC, 1 replies.
- [jira] [Resolved] (NUTCH-2747) Replace remaining o.a.commons.logging by org.slf4j - posted by "Sebastian Nagel (Jira)" <ji...@apache.org> on 2019/11/07 08:03:01 UTC, 0 replies.
- [jira] [Updated] (NUTCH-2746) Basic URL normalizer to normalize Unicode domain names - posted by "Sebastian Nagel (Jira)" <ji...@apache.org> on 2019/11/07 08:04:00 UTC, 0 replies.
- [jira] [Commented] (NUTCH-1559) parse-metatags duplicates extracted metatags - posted by "ASF GitHub Bot (Jira)" <ji...@apache.org> on 2019/11/07 08:24:00 UTC, 1 replies.
- [jira] [Resolved] (NUTCH-1559) parse-metatags duplicates extracted metatags - posted by "Sebastian Nagel (Jira)" <ji...@apache.org> on 2019/11/07 08:27:00 UTC, 0 replies.
- [jira] [Updated] (NUTCH-1559) parse-metatags duplicates extracted metatags - posted by "Sebastian Nagel (Jira)" <ji...@apache.org> on 2019/11/07 08:27:00 UTC, 0 replies.
- [jira] [Updated] (NUTCH-1337) WebGraph to follow redirects - posted by "Sebastian Nagel (Jira)" <ji...@apache.org> on 2019/11/07 09:11:00 UTC, 1 replies.
- [jira] [Commented] (NUTCH-2739) indexer-elastic: Upgrade ES and migrate to REST client - posted by "ASF GitHub Bot (Jira)" <ji...@apache.org> on 2019/11/07 10:41:01 UTC, 12 replies.
- [jira] [Updated] (NUTCH-2748) Fetch status gone (redirect exceeded) not to overwrite existing items in CrawlDb - posted by "Sebastian Nagel (Jira)" <ji...@apache.org> on 2019/11/08 11:57:00 UTC, 0 replies.
- [jira] [Commented] (NUTCH-2748) Fetch status gone (redirect exceeded) not to overwrite existing items in CrawlDb - posted by "Sebastian Nagel (Jira)" <ji...@apache.org> on 2019/11/08 12:06:00 UTC, 3 replies.
- Static source code analysis via sonarcloud.io - posted by lewis john mcgibbney <le...@apache.org> on 2019/11/08 16:50:42 UTC, 1 replies.
- [jira] [Created] (NUTCH-2753) Add -listen option to command-line help of CrawlDbReader and LinkDbReader - posted by "Sebastian Nagel (Jira)" <ji...@apache.org> on 2019/11/09 10:03:00 UTC, 0 replies.
- [jira] [Commented] (NUTCH-2750) improve CrawlDbReader & LinkDbReader reader handling - posted by "ASF GitHub Bot (Jira)" <ji...@apache.org> on 2019/11/09 10:08:00 UTC, 0 replies.
- [jira] [Updated] (NUTCH-2750) Improve CrawlDbReader & LinkDbReader reader handling - posted by "Sebastian Nagel (Jira)" <ji...@apache.org> on 2019/11/12 14:29:00 UTC, 1 replies.
- [jira] [Commented] (NUTCH-2750) Improve CrawlDbReader & LinkDbReader reader handling - posted by "ASF GitHub Bot (Jira)" <ji...@apache.org> on 2019/11/12 14:30:00 UTC, 1 replies.
- [jira] [Resolved] (NUTCH-2750) Improve CrawlDbReader & LinkDbReader reader handling - posted by "Sebastian Nagel (Jira)" <ji...@apache.org> on 2019/11/12 14:39:01 UTC, 0 replies.
- [jira] [Updated] (NUTCH-2720) ROBOTS metatag ignored when capitalized - posted by "Sebastian Nagel (Jira)" <ji...@apache.org> on 2019/11/12 14:40:00 UTC, 0 replies.
- [jira] [Updated] (NUTCH-2662) index-jexl-filter plugin throws a RuntimeException if its enabled but not configured - posted by "Sebastian Nagel (Jira)" <ji...@apache.org> on 2019/11/12 14:41:00 UTC, 0 replies.
- [jira] [Updated] (NUTCH-2634) Some links marked as "nofollow" are followed anyway. - posted by "Sebastian Nagel (Jira)" <ji...@apache.org> on 2019/11/12 14:47:00 UTC, 0 replies.
- [jira] [Updated] (NUTCH-2608) Reduce size of Nutch job file and package - posted by "Sebastian Nagel (Jira)" <ji...@apache.org> on 2019/11/12 14:47:00 UTC, 0 replies.
- [jira] [Updated] (NUTCH-2599) charset detection issue with parse-tika - posted by "Sebastian Nagel (Jira)" <ji...@apache.org> on 2019/11/12 14:51:00 UTC, 0 replies.
- [jira] [Resolved] (NUTCH-2603) Bring back legacy pre-Tika parsers and use them as back up parsers - posted by "Sebastian Nagel (Jira)" <ji...@apache.org> on 2019/11/12 14:51:00 UTC, 0 replies.
- Build failed in Jenkins: Nutch-trunk #3653 - posted by Apache Jenkins Server <je...@builds.apache.org> on 2019/11/12 14:54:39 UTC, 0 replies.
- [jira] [Updated] (NUTCH-2582) Set pool size of XML SAX parsers used for MIME detection in Tika 1.19 - posted by "Sebastian Nagel (Jira)" <ji...@apache.org> on 2019/11/12 15:03:00 UTC, 0 replies.
- [jira] [Updated] (NUTCH-2567) parse-metatags writes all meta tags twice - posted by "Sebastian Nagel (Jira)" <ji...@apache.org> on 2019/11/12 15:05:00 UTC, 0 replies.
- [jira] [Updated] (NUTCH-2546) parse-(metatags|html) plugin - "meta property" not extracted only "meta name" - posted by "Sebastian Nagel (Jira)" <ji...@apache.org> on 2019/11/12 15:05:00 UTC, 0 replies.
- [jira] [Updated] (NUTCH-2541) Non-ASCII characters in the URL path are not properly escaped by the protocol-httpclient plugin - posted by "Sebastian Nagel (Jira)" <ji...@apache.org> on 2019/11/12 15:05:00 UTC, 0 replies.
- [jira] [Updated] (NUTCH-2532) Throw error if HBase is not available while running nutch commands. - posted by "Sebastian Nagel (Jira)" <ji...@apache.org> on 2019/11/12 15:08:00 UTC, 0 replies.
- [jira] [Updated] (NUTCH-2531) Unclear steps in Nutch2 Tutorial - posted by "Sebastian Nagel (Jira)" <ji...@apache.org> on 2019/11/12 15:09:00 UTC, 0 replies.
- [jira] [Resolved] (NUTCH-2532) Throw error if HBase is not available while running nutch commands. - posted by "Sebastian Nagel (Jira)" <ji...@apache.org> on 2019/11/12 15:09:00 UTC, 0 replies.
- [jira] [Resolved] (NUTCH-2531) Unclear steps in Nutch2 Tutorial - posted by "Sebastian Nagel (Jira)" <ji...@apache.org> on 2019/11/12 15:09:00 UTC, 0 replies.
- [jira] [Commented] (NUTCH-2531) Unclear steps in Nutch2 Tutorial - posted by "Shashanka Balakuntala Srinivasa (Jira)" <ji...@apache.org> on 2019/11/13 03:25:00 UTC, 1 replies.
- [jira] [Created] (NUTCH-2754) fetcher.max.crawl.delay ignored if exceeding 5 min. / 300 sec. - posted by "Sebastian Nagel (Jira)" <ji...@apache.org> on 2019/11/13 13:52:00 UTC, 0 replies.
- [jira] [Commented] (NUTCH-2754) fetcher.max.crawl.delay ignored if exceeding 5 min. / 300 sec. - posted by "Sebastian Nagel (Jira)" <ji...@apache.org> on 2019/11/13 14:45:00 UTC, 0 replies.
- [jira] [Resolved] (NUTCH-2529) "ant runtime" warns? about "Could not load definitions from resource org/sonar/ant/antlib.xml. It could not be found." - posted by "Sebastian Nagel (Jira)" <ji...@apache.org> on 2019/11/15 10:45:00 UTC, 0 replies.
- [jira] [Updated] (NUTCH-2507) NutchTutorial wiki pages as a lot of outdated command line calls when it starts with the solr interaction - posted by "Sebastian Nagel (Jira)" <ji...@apache.org> on 2019/11/15 10:45:00 UTC, 0 replies.
- [jira] [Updated] (NUTCH-2529) "ant runtime" warns? about "Could not load definitions from resource org/sonar/ant/antlib.xml. It could not be found." - posted by "Sebastian Nagel (Jira)" <ji...@apache.org> on 2019/11/15 10:45:00 UTC, 1 replies.
- [jira] [Updated] (NUTCH-2504) Results of maxCountExpr and fetchDelayExpr should be stored in memory in Generate - posted by "Sebastian Nagel (Jira)" <ji...@apache.org> on 2019/11/15 10:46:00 UTC, 0 replies.
- [jira] [Updated] (NUTCH-2496) Speed up link inversion step in crawling script - posted by "Sebastian Nagel (Jira)" <ji...@apache.org> on 2019/11/15 10:47:00 UTC, 2 replies.
- [jira] [Updated] (NUTCH-2495) Use -deleteGone instead of clean job in crawler script while indexing - posted by "Sebastian Nagel (Jira)" <ji...@apache.org> on 2019/11/15 10:47:01 UTC, 2 replies.
- [jira] [Updated] (NUTCH-2481) HostDatum deltas(previous step statistics) and Metadata expressions - posted by "Sebastian Nagel (Jira)" <ji...@apache.org> on 2019/11/15 10:48:00 UTC, 0 replies.
- [jira] [Resolved] (NUTCH-2479) urlmeta plugin port from 1.x to 2.x - posted by "Sebastian Nagel (Jira)" <ji...@apache.org> on 2019/11/15 10:48:00 UTC, 0 replies.
- [jira] [Updated] (NUTCH-2479) urlmeta plugin port from 1.x to 2.x - posted by "Sebastian Nagel (Jira)" <ji...@apache.org> on 2019/11/15 10:48:00 UTC, 0 replies.
- [jira] [Updated] (NUTCH-2471) Returning a bare string meant to be application/json doesn't properly quote the string - posted by "Sebastian Nagel (Jira)" <ji...@apache.org> on 2019/11/15 10:49:00 UTC, 0 replies.
- [jira] [Resolved] (NUTCH-2471) Returning a bare string meant to be application/json doesn't properly quote the string - posted by "Sebastian Nagel (Jira)" <ji...@apache.org> on 2019/11/15 10:49:00 UTC, 0 replies.
- [jira] [Updated] (NUTCH-2462) Cleanup Tika Boilerpipe patch - posted by "Sebastian Nagel (Jira)" <ji...@apache.org> on 2019/11/15 10:49:00 UTC, 0 replies.
- [jira] [Resolved] (NUTCH-2462) Cleanup Tika Boilerpipe patch - posted by "Sebastian Nagel (Jira)" <ji...@apache.org> on 2019/11/15 10:49:00 UTC, 0 replies.
- [jira] [Updated] (NUTCH-2453) FTP protocol seems to have issues running multithreaded - posted by "Sebastian Nagel (Jira)" <ji...@apache.org> on 2019/11/15 10:50:00 UTC, 0 replies.
- [jira] [Updated] (NUTCH-2455) Speed up the merging of HostDb entries for variable fetch delay - posted by "Sebastian Nagel (Jira)" <ji...@apache.org> on 2019/11/15 10:50:00 UTC, 0 replies.
- [jira] [Updated] (NUTCH-2428) Provide binary release for Nutch - posted by "Sebastian Nagel (Jira)" <ji...@apache.org> on 2019/11/15 10:51:00 UTC, 0 replies.
- [jira] [Updated] (NUTCH-2449) Usage of Tika LanguageIdentifier in language-identifier plugin - posted by "Sebastian Nagel (Jira)" <ji...@apache.org> on 2019/11/15 10:51:00 UTC, 0 replies.
- [jira] [Updated] (NUTCH-2434) Option to reset parameters HTMLMetaTags - posted by "Sebastian Nagel (Jira)" <ji...@apache.org> on 2019/11/15 10:51:00 UTC, 0 replies.
- [jira] [Resolved] (NUTCH-2428) Provide binary release for Nutch - posted by "Sebastian Nagel (Jira)" <ji...@apache.org> on 2019/11/15 10:51:00 UTC, 0 replies.
- [jira] [Resolved] (NUTCH-2426) Provide reason for job failure in job overview - posted by "Sebastian Nagel (Jira)" <ji...@apache.org> on 2019/11/15 10:52:00 UTC, 0 replies.
- [jira] [Updated] (NUTCH-2425) Update GettingNutchRunningWithUbuntu wiki article - posted by "Sebastian Nagel (Jira)" <ji...@apache.org> on 2019/11/15 10:52:00 UTC, 2 replies.
- [jira] [Updated] (NUTCH-2426) Provide reason for job failure in job overview - posted by "Sebastian Nagel (Jira)" <ji...@apache.org> on 2019/11/15 10:52:00 UTC, 0 replies.
- [jira] [Updated] (NUTCH-2423) Update contributor info page - posted by "Sebastian Nagel (Jira)" <ji...@apache.org> on 2019/11/15 10:52:00 UTC, 3 replies.
- [jira] [Updated] (NUTCH-2421) parse-html to prioritize HTML5 charset definitions - posted by "Sebastian Nagel (Jira)" <ji...@apache.org> on 2019/11/15 10:55:00 UTC, 1 replies.
- [jira] [Commented] (NUTCH-2425) Update GettingNutchRunningWithUbuntu wiki article - posted by "Sebastian Nagel (Jira)" <ji...@apache.org> on 2019/11/15 10:55:00 UTC, 0 replies.
- [jira] [Updated] (NUTCH-2396) Cannot stop or abort fetch job via REST API - posted by "Sebastian Nagel (Jira)" <ji...@apache.org> on 2019/11/15 10:56:00 UTC, 0 replies.
- [jira] [Updated] (NUTCH-2407) Memory leak causing Nutch Server to run out of memory - posted by "Sebastian Nagel (Jira)" <ji...@apache.org> on 2019/11/15 10:56:00 UTC, 1 replies.
- [jira] [Updated] (NUTCH-2385) 1.x Elasticsearch Indexer - path.home is not configured - posted by "Sebastian Nagel (Jira)" <ji...@apache.org> on 2019/11/15 10:57:00 UTC, 0 replies.
- [jira] [Updated] (NUTCH-2361) Deprecated nutch and solr integration documentation. - posted by "Sebastian Nagel (Jira)" <ji...@apache.org> on 2019/11/15 10:59:00 UTC, 0 replies.
- [jira] [Updated] (NUTCH-2379) crawl script dedup's crawldb update is slow - posted by "Sebastian Nagel (Jira)" <ji...@apache.org> on 2019/11/15 10:59:00 UTC, 0 replies.
- [jira] [Resolved] (NUTCH-2361) Deprecated nutch and solr integration documentation. - posted by "Sebastian Nagel (Jira)" <ji...@apache.org> on 2019/11/15 11:56:00 UTC, 0 replies.
- [jira] [Resolved] (NUTCH-2343) Calling nutch extension points before custom plugin - posted by "Sebastian Nagel (Jira)" <ji...@apache.org> on 2019/11/15 11:57:00 UTC, 0 replies.
- [jira] [Updated] (NUTCH-2343) Calling nutch extension points before custom plugin - posted by "Sebastian Nagel (Jira)" <ji...@apache.org> on 2019/11/15 11:57:00 UTC, 0 replies.
- [jira] [Updated] (NUTCH-2341) bin/crawl do not fetch batchId generated by bash script - posted by "Sebastian Nagel (Jira)" <ji...@apache.org> on 2019/11/15 11:58:00 UTC, 0 replies.
- [jira] [Resolved] (NUTCH-2332) Indexer-elastic2 plugin availability timeline - posted by "Sebastian Nagel (Jira)" <ji...@apache.org> on 2019/11/15 11:58:00 UTC, 0 replies.
- [jira] [Resolved] (NUTCH-2341) bin/crawl do not fetch batchId generated by bash script - posted by "Sebastian Nagel (Jira)" <ji...@apache.org> on 2019/11/15 11:58:00 UTC, 0 replies.
- [jira] [Updated] (NUTCH-2332) Indexer-elastic2 plugin availability timeline - posted by "Sebastian Nagel (Jira)" <ji...@apache.org> on 2019/11/15 11:58:00 UTC, 0 replies.
- [jira] [Updated] (NUTCH-2230) Nutch doesn't index all URLs found - posted by "Sebastian Nagel (Jira)" <ji...@apache.org> on 2019/11/15 12:11:00 UTC, 0 replies.
- [jira] [Resolved] (NUTCH-2270) Solr indexer Failed i - posted by "Sebastian Nagel (Jira)" <ji...@apache.org> on 2019/11/15 12:11:00 UTC, 0 replies.
- [jira] [Resolved] (NUTCH-2230) Nutch doesn't index all URLs found - posted by "Sebastian Nagel (Jira)" <ji...@apache.org> on 2019/11/15 12:12:00 UTC, 0 replies.
- [jira] [Closed] (NUTCH-2270) Solr indexer Failed i - posted by "Sebastian Nagel (Jira)" <ji...@apache.org> on 2019/11/22 13:13:00 UTC, 0 replies.
- [jira] [Updated] (NUTCH-2681) ClassCastException - Apache Nutch 1.x, Selenium v2.48.2, firefox 31.4.0 - posted by "Sebastian Nagel (Jira)" <ji...@apache.org> on 2019/11/22 13:16:00 UTC, 0 replies.
- [jira] [Commented] (NUTCH-2681) ClassCastException - Apache Nutch 1.x, Selenium v2.48.2, firefox 31.4.0 - posted by "Sebastian Nagel (Jira)" <ji...@apache.org> on 2019/11/22 13:18:00 UTC, 0 replies.
- [jira] [Updated] (NUTCH-2586) Add a fallback mechanism for missing meta tags - posted by "Sebastian Nagel (Jira)" <ji...@apache.org> on 2019/11/22 13:18:00 UTC, 2 replies.
- [jira] [Updated] (NUTCH-2076) exceptions are not handled when using method waitForCompletion in a try block - posted by "Sebastian Nagel (Jira)" <ji...@apache.org> on 2019/11/22 13:21:00 UTC, 0 replies.
- [jira] [Updated] (NUTCH-2075) Generate will not choose URL without distance marker - posted by "Sebastian Nagel (Jira)" <ji...@apache.org> on 2019/11/22 13:22:02 UTC, 0 replies.
- [jira] [Resolved] (NUTCH-2075) Generate will not choose URL without distance marker - posted by "Sebastian Nagel (Jira)" <ji...@apache.org> on 2019/11/22 13:22:02 UTC, 0 replies.
- [jira] [Resolved] (NUTCH-2076) exceptions are not handled when using method waitForCompletion in a try block - posted by "Sebastian Nagel (Jira)" <ji...@apache.org> on 2019/11/22 13:22:02 UTC, 0 replies.
- [jira] [Updated] (NUTCH-2323) ElasticSearch Indexer does not work on Nutch 2.3.1 - posted by "Sebastian Nagel (Jira)" <ji...@apache.org> on 2019/11/22 13:23:00 UTC, 0 replies.
- [jira] [Resolved] (NUTCH-2323) ElasticSearch Indexer does not work on Nutch 2.3.1 - posted by "Sebastian Nagel (Jira)" <ji...@apache.org> on 2019/11/22 13:23:00 UTC, 0 replies.
- [jira] [Updated] (NUTCH-2331) REST API Fetch fails to retrieve HDFS path on distributed mode - posted by "Sebastian Nagel (Jira)" <ji...@apache.org> on 2019/11/22 13:23:00 UTC, 0 replies.
- [jira] [Created] (NUTCH-2755) Remove obsolete plugin indexer-elastic-rest - posted by "Sebastian Nagel (Jira)" <ji...@apache.org> on 2019/11/22 13:30:00 UTC, 0 replies.
- [jira] [Updated] (NUTCH-2755) Remove obsolete plugin indexer-elastic-rest - posted by "Sebastian Nagel (Jira)" <ji...@apache.org> on 2019/11/22 13:31:00 UTC, 0 replies.
- [jira] [Resolved] (NUTCH-2739) indexer-elastic: Upgrade ES and migrate to REST client - posted by "Sebastian Nagel (Jira)" <ji...@apache.org> on 2019/11/22 13:33:00 UTC, 0 replies.
- [jira] [Updated] (NUTCH-2318) Text extraction in HtmlParser adds too much whitespace. - posted by "Sebastian Nagel (Jira)" <ji...@apache.org> on 2019/11/22 13:48:00 UTC, 1 replies.
- Jenkins build is back to normal : Nutch-trunk #3654 - posted by Apache Jenkins Server <je...@builds.apache.org> on 2019/11/22 14:00:04 UTC, 0 replies.
- [jira] [Updated] (NUTCH-2293) Make the unit tests which requires "plugin.folders" as integration tests - posted by "Sebastian Nagel (Jira)" <ji...@apache.org> on 2019/11/22 14:02:00 UTC, 0 replies.
- [jira] [Commented] (NUTCH-2318) Text extraction in HtmlParser adds too much whitespace. - posted by "Sebastian Nagel (Jira)" <ji...@apache.org> on 2019/11/22 14:02:00 UTC, 0 replies.
- [jira] [Updated] (NUTCH-2277) Adding goldstandard.txt default file in conf - posted by "Sebastian Nagel (Jira)" <ji...@apache.org> on 2019/11/22 14:03:00 UTC, 1 replies.
- [jira] [Updated] (NUTCH-2275) MD5Signature by default doesn't take in account parse - posted by "Sebastian Nagel (Jira)" <ji...@apache.org> on 2019/11/22 14:44:00 UTC, 0 replies.
- [jira] [Commented] (NUTCH-2275) MD5Signature by default doesn't take in account parse - posted by "Sebastian Nagel (Jira)" <ji...@apache.org> on 2019/11/22 15:03:00 UTC, 0 replies.
- [jira] [Updated] (NUTCH-2268) SolrIndexerJob: java.lang.RuntimeException - posted by "Sebastian Nagel (Jira)" <ji...@apache.org> on 2019/11/22 15:04:00 UTC, 0 replies.
- [jira] [Updated] (NUTCH-2274) InteractiveSelenium Plugin's DefaultHandler Returns Null - posted by "Sebastian Nagel (Jira)" <ji...@apache.org> on 2019/11/22 15:04:00 UTC, 0 replies.
- [jira] [Resolved] (NUTCH-2268) SolrIndexerJob: java.lang.RuntimeException - posted by "Sebastian Nagel (Jira)" <ji...@apache.org> on 2019/11/22 15:04:00 UTC, 0 replies.
- [jira] [Updated] (NUTCH-2265) Write A Test Package for Scoring Similarity - posted by "Sebastian Nagel (Jira)" <ji...@apache.org> on 2019/11/22 15:05:00 UTC, 0 replies.
- [jira] [Resolved] (NUTCH-2253) ProtocolFactory still not thread-safe - posted by "Sebastian Nagel (Jira)" <ji...@apache.org> on 2019/11/22 15:13:00 UTC, 0 replies.
- [jira] [Updated] (NUTCH-2249) WordNet Integration for Cosine Similarity - posted by "Sebastian Nagel (Jira)" <ji...@apache.org> on 2019/11/22 15:14:00 UTC, 0 replies.
- [jira] [Resolved] (NUTCH-2240) java.lang.NoSuchFieldError: INSTANCE selenium nutch - posted by "Sebastian Nagel (Jira)" <ji...@apache.org> on 2019/11/22 15:16:00 UTC, 0 replies.
- [jira] [Updated] (NUTCH-2134) Redirection and cookie handling using protocol plugins - posted by "Sebastian Nagel (Jira)" <ji...@apache.org> on 2019/11/22 15:19:00 UTC, 0 replies.
- [jira] [Resolved] (NUTCH-2131) Problem running nutch(crawl) with selenium - posted by "Sebastian Nagel (Jira)" <ji...@apache.org> on 2019/11/22 15:21:00 UTC, 0 replies.
- [jira] [Resolved] (NUTCH-2126) Use selenium protocol for specific sites - posted by "Sebastian Nagel (Jira)" <ji...@apache.org> on 2019/11/22 15:24:00 UTC, 0 replies.
- [jira] [Updated] (NUTCH-2103) Nutch 2.3 has an old version of hbase jar in runtime/lib folder - posted by "Sebastian Nagel (Jira)" <ji...@apache.org> on 2019/11/22 15:25:00 UTC, 0 replies.
- [jira] [Resolved] (NUTCH-2113) Need documentation for using various Gora backends - posted by "Sebastian Nagel (Jira)" <ji...@apache.org> on 2019/11/22 15:25:00 UTC, 0 replies.
- [jira] [Updated] (NUTCH-2118) browser requests sometimes timeout when using the selenium grid because of port access issues - posted by "Sebastian Nagel (Jira)" <ji...@apache.org> on 2019/11/22 15:25:00 UTC, 0 replies.
- [jira] [Updated] (NUTCH-2113) Need documentation for using various Gora backends - posted by "Sebastian Nagel (Jira)" <ji...@apache.org> on 2019/11/22 15:25:00 UTC, 0 replies.
- [jira] [Resolved] (NUTCH-2103) Nutch 2.3 has an old version of hbase jar in runtime/lib folder - posted by "Sebastian Nagel (Jira)" <ji...@apache.org> on 2019/11/22 15:26:00 UTC, 0 replies.
- [jira] [Closed] (NUTCH-2076) exceptions are not handled when using method waitForCompletion in a try block - posted by "Sebastian Nagel (Jira)" <ji...@apache.org> on 2019/11/22 15:28:00 UTC, 0 replies.
- [jira] [Closed] (NUTCH-2075) Generate will not choose URL without distance marker - posted by "Sebastian Nagel (Jira)" <ji...@apache.org> on 2019/11/22 15:28:00 UTC, 0 replies.
- [jira] [Closed] (NUTCH-2532) Throw error if HBase is not available while running nutch commands. - posted by "Sebastian Nagel (Jira)" <ji...@apache.org> on 2019/11/22 15:30:00 UTC, 0 replies.
- [jira] [Resolved] (NUTCH-2032) Plugin to index the raw content of a readable document. - posted by "Sebastian Nagel (Jira)" <ji...@apache.org> on 2019/11/22 15:43:00 UTC, 0 replies.
- [jira] [Resolved] (NUTCH-2024) httpcore classpath jar conflict when invoking protocol-selenium - posted by "Sebastian Nagel (Jira)" <ji...@apache.org> on 2019/11/22 15:46:00 UTC, 0 replies.
- [jira] [Resolved] (NUTCH-2003) topN is not work correctly - posted by "Sebastian Nagel (Jira)" <ji...@apache.org> on 2019/11/22 15:46:01 UTC, 0 replies.
- [jira] [Updated] (NUTCH-2002) ParserChecker to check robots.txt - posted by "Sebastian Nagel (Jira)" <ji...@apache.org> on 2019/11/22 15:47:00 UTC, 1 replies.
- [jira] [Closed] (NUTCH-2003) topN is not work correctly - posted by "Sebastian Nagel (Jira)" <ji...@apache.org> on 2019/11/22 15:47:00 UTC, 0 replies.
- [jira] [Updated] (NUTCH-1999) Add http://nutch.apache.org/robots.txt - posted by "Sebastian Nagel (Jira)" <ji...@apache.org> on 2019/11/22 15:47:00 UTC, 0 replies.
- [jira] [Closed] (NUTCH-1984) Eliminate unnecessary dependencies - posted by "Sebastian Nagel (Jira)" <ji...@apache.org> on 2019/11/22 15:48:00 UTC, 0 replies.
- [jira] [Resolved] (NUTCH-1984) Eliminate unnecessary dependencies - posted by "Sebastian Nagel (Jira)" <ji...@apache.org> on 2019/11/22 15:48:00 UTC, 0 replies.
- [jira] [Updated] (NUTCH-1971) The crawldb.url.filters property is not present in any configuration file - posted by "Sebastian Nagel (Jira)" <ji...@apache.org> on 2019/11/22 15:50:00 UTC, 0 replies.
- [jira] [Commented] (NUTCH-1971) The crawldb.url.filters property is not present in any configuration file - posted by "Sebastian Nagel (Jira)" <ji...@apache.org> on 2019/11/22 15:50:00 UTC, 0 replies.
- [jira] [Commented] (NUTCH-2184) Enable IndexingJob to function with no crawldb - posted by "ASF GitHub Bot (Jira)" <ji...@apache.org> on 2019/11/22 17:45:00 UTC, 0 replies.
- [jira] [Commented] (NUTCH-2746) Basic URL normalizer to normalize Unicode domain names - posted by "ASF GitHub Bot (Jira)" <ji...@apache.org> on 2019/11/22 17:53:00 UTC, 0 replies.
- [jira] [Resolved] (NUTCH-2746) Basic URL normalizer to normalize Unicode domain names - posted by "Sebastian Nagel (Jira)" <ji...@apache.org> on 2019/11/22 17:55:00 UTC, 0 replies.
- [jira] [Commented] (NUTCH-2202) Integration of Anthelion (Focused Crawling Module) into Nutch - posted by "ASF GitHub Bot (Jira)" <ji...@apache.org> on 2019/11/23 11:17:00 UTC, 3 replies.