- Build failed in Jenkins: Nutch-trunk #1443 - posted by Apache Hudson Server <hu...@hudson.apache.org> on 2011/04/01 06:02:17 UTC, 0 replies.
- [jira] [Updated] (NUTCH-897) Subcollection requires blacklist element - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/01 15:05:05 UTC, 3 replies.
- Clean up open legacy issues in Jira - posted by Markus Jelsma <ma...@openindex.io> on 2011/04/01 16:03:45 UTC, 2 replies.
- [jira] [Commented] (NUTCH-973) Remove Segment Merger in 1.3 - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/01 16:05:06 UTC, 0 replies.
- Re: java.sql.BatchUpdateException after fetch and wrong WebPage.protocolStatus in trunk - posted by Markus Jelsma <ma...@openindex.io> on 2011/04/01 16:11:06 UTC, 0 replies.
- [jira] [Closed] (NUTCH-973) Remove Segment Merger in 1.3 - posted by "Julien Nioche (JIRA)" <ji...@apache.org> on 2011/04/01 16:21:05 UTC, 0 replies.
- [jira] [Closed] (NUTCH-36) Chinese in Nutch - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/01 16:27:06 UTC, 0 replies.
- [jira] [Closed] (NUTCH-39) pagination in search result - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/01 16:27:06 UTC, 0 replies.
- [jira] [Closed] (NUTCH-13) If dns points to 127.0.0.1, the url is also crawled - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/01 16:27:06 UTC, 0 replies.
- [jira] [Closed] (NUTCH-18) Windows servers include illegal characters in URLs - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/01 16:27:06 UTC, 0 replies.
- [jira] [Closed] (NUTCH-103) Vivisimo like treeview and url redirect - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/01 16:29:05 UTC, 0 replies.
- [jira] [Closed] (NUTCH-79) Fault tolerant searching. - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/01 16:29:05 UTC, 0 replies.
- [jira] [Closed] (NUTCH-83) Release deliverable as zip - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/01 16:29:05 UTC, 0 replies.
- [jira] [Commented] (NUTCH-18) Windows servers include illegal characters in URLs - posted by "David Escuer (JIRA)" <ji...@apache.org> on 2011/04/01 16:31:05 UTC, 0 replies.
- [jira] [Closed] (NUTCH-144) corrupt language identifier tri files and bad language recognition for german - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/01 16:31:06 UTC, 0 replies.
- [jira] [Closed] (NUTCH-180) Performance problem with widely used keywords - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/01 16:31:06 UTC, 0 replies.
- [jira] [Closed] (NUTCH-104) Nutch query parser does not support CJK bi-gram segmentation. - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/01 16:31:06 UTC, 0 replies.
- [jira] [Closed] (NUTCH-132) Add ability to sort on more than one column - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/01 16:31:06 UTC, 0 replies.
- [jira] [Closed] (NUTCH-775) Enhance Searcher interface - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/01 16:33:05 UTC, 0 replies.
- [jira] [Closed] (NUTCH-581) DistributedSearch does not update search servers added to search-servers.txt on the fly - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/01 16:33:05 UTC, 0 replies.
- [jira] [Closed] (NUTCH-877) Allow setting of slop values for non-quote phrase queries on query-basic plugin - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/01 16:33:05 UTC, 0 replies.
- [jira] [Updated] (NUTCH-541) Index url field untokenized - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/01 16:35:06 UTC, 0 replies.
- [jira] [Updated] (NUTCH-674) NutchBean doesn't check for searcher.dir existance. - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/01 16:35:06 UTC, 0 replies.
- [jira] [Updated] (NUTCH-943) Search Results default dedup field "site" should be stored in index. - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/01 16:35:06 UTC, 0 replies.
- [jira] [Updated] (NUTCH-47) Configure host filter to do wildcard prefixes - *.redhat.com - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/01 16:35:06 UTC, 0 replies.
- [jira] [Updated] (NUTCH-377) Add possibility to search for multiple values - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/01 16:35:06 UTC, 0 replies.
- [jira] [Updated] (NUTCH-470) Adding optional terms to a query - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/01 16:35:06 UTC, 0 replies.
- [jira] [Updated] (NUTCH-265) Getting Clustered results in better form. - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/01 16:35:06 UTC, 0 replies.
- [jira] [Updated] (NUTCH-542) Null Pointer Exception on getSummary when segment no longer exists - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/01 16:35:06 UTC, 0 replies.
- [jira] [Updated] (NUTCH-423) Add other index-basic fields as query plugins - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/01 16:35:06 UTC, 0 replies.
- [jira] [Updated] (NUTCH-540) some problem about the Nutch cache - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/01 16:35:07 UTC, 0 replies.
- [jira] [Updated] (NUTCH-355) The title of query result could like the summary have the highlight?? - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/01 16:35:07 UTC, 0 replies.
- [jira] [Updated] (NUTCH-469) changes to geoPosition plugin to make it work on nutch 0.9 - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/01 16:35:07 UTC, 0 replies.
- [jira] [Updated] (NUTCH-260) Three new plugins that parse, index and query meta tags defined in the configuration - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/01 16:35:07 UTC, 0 replies.
- [jira] [Updated] (NUTCH-72) Query basic filter with correction feature - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/01 16:35:07 UTC, 0 replies.
- [jira] [Updated] (NUTCH-480) Searching multiple indexes with a single nutch instance - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/01 16:35:07 UTC, 0 replies.
- [jira] [Updated] (NUTCH-294) Topic-maps of related searchwords - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/01 16:35:07 UTC, 0 replies.
- [jira] [Updated] (NUTCH-941) Search returns blank page, when there is more than one SOLR server configured - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/01 16:35:07 UTC, 0 replies.
- [jira] [Updated] (NUTCH-453) Move stop words to a config file - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/01 16:35:07 UTC, 0 replies.
- [jira] [Updated] (NUTCH-466) Flexible segment format - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/01 16:35:07 UTC, 0 replies.
- [jira] [Updated] (NUTCH-445) Domain İndexing / Query Filter - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/01 16:35:08 UTC, 0 replies.
- [jira] [Updated] (NUTCH-708) NutchBean: OOM due to searcher.max.hits and dedup. - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/01 16:35:08 UTC, 0 replies.
- [jira] [Updated] (NUTCH-820) Infinite loop when hitspersite is set - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/01 16:35:08 UTC, 0 replies.
- [jira] [Updated] (NUTCH-92) DistributedSearch incorrectly scores results - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/01 16:35:08 UTC, 0 replies.
- [jira] [Updated] (NUTCH-455) dedup on tokenized fields is faulty - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/01 16:35:09 UTC, 0 replies.
- [jira] [Updated] (NUTCH-764) Add support for vfsfile:// loading of plugins for JBoss - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/01 16:35:09 UTC, 0 replies.
- [jira] [Updated] (NUTCH-638) Launching Distributed Searchers with URI indicating filesystem to use rather than relying on hadoop config files. - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/01 16:35:09 UTC, 0 replies.
- [jira] [Updated] (NUTCH-386) Plugin to index categories by url rules - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/01 16:35:09 UTC, 0 replies.
- [jira] [Updated] (NUTCH-479) Support for OR queries - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/01 16:35:09 UTC, 0 replies.
- [jira] [Updated] (NUTCH-573) Multiple Domains - Query Search - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/01 16:35:09 UTC, 0 replies.
- [jira] [Closed] (NUTCH-72) Query basic filter with correction feature - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/01 16:37:06 UTC, 0 replies.
- [jira] [Closed] (NUTCH-540) some problem about the Nutch cache - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/01 16:37:06 UTC, 0 replies.
- [jira] [Closed] (NUTCH-294) Topic-maps of related searchwords - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/01 16:37:06 UTC, 0 replies.
- [jira] [Closed] (NUTCH-943) Search Results default dedup field "site" should be stored in index. - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/01 16:37:06 UTC, 0 replies.
- [jira] [Closed] (NUTCH-47) Configure host filter to do wildcard prefixes - *.redhat.com - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/01 16:37:06 UTC, 0 replies.
- [jira] [Closed] (NUTCH-480) Searching multiple indexes with a single nutch instance - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/01 16:37:06 UTC, 0 replies.
- [jira] [Closed] (NUTCH-541) Index url field untokenized - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/01 16:37:07 UTC, 0 replies.
- [jira] [Closed] (NUTCH-445) Domain İndexing / Query Filter - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/01 16:37:07 UTC, 0 replies.
- [jira] [Closed] (NUTCH-674) NutchBean doesn't check for searcher.dir existance. - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/01 16:37:07 UTC, 0 replies.
- [jira] [Closed] (NUTCH-708) NutchBean: OOM due to searcher.max.hits and dedup. - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/01 16:37:07 UTC, 0 replies.
- [jira] [Closed] (NUTCH-469) changes to geoPosition plugin to make it work on nutch 0.9 - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/01 16:37:07 UTC, 0 replies.
- [jira] [Closed] (NUTCH-573) Multiple Domains - Query Search - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/01 16:37:07 UTC, 0 replies.
- [jira] [Closed] (NUTCH-92) DistributedSearch incorrectly scores results - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/01 16:37:07 UTC, 0 replies.
- [jira] [Closed] (NUTCH-820) Infinite loop when hitspersite is set - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/01 16:37:07 UTC, 0 replies.
- [jira] [Closed] (NUTCH-466) Flexible segment format - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/01 16:37:07 UTC, 0 replies.
- [jira] [Closed] (NUTCH-455) dedup on tokenized fields is faulty - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/01 16:37:07 UTC, 0 replies.
- [jira] [Closed] (NUTCH-479) Support for OR queries - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/01 16:37:08 UTC, 0 replies.
- [jira] [Closed] (NUTCH-453) Move stop words to a config file - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/01 16:37:09 UTC, 0 replies.
- [jira] [Closed] (NUTCH-386) Plugin to index categories by url rules - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/01 16:37:09 UTC, 0 replies.
- [jira] [Closed] (NUTCH-638) Launching Distributed Searchers with URI indicating filesystem to use rather than relying on hadoop config files. - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/01 16:37:09 UTC, 0 replies.
- [jira] [Closed] (NUTCH-377) Add possibility to search for multiple values - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/01 16:37:09 UTC, 0 replies.
- [jira] [Closed] (NUTCH-355) The title of query result could like the summary have the highlight?? - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/01 16:37:09 UTC, 0 replies.
- [jira] [Closed] (NUTCH-470) Adding optional terms to a query - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/01 16:37:09 UTC, 0 replies.
- [jira] [Closed] (NUTCH-260) Three new plugins that parse, index and query meta tags defined in the configuration - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/01 16:37:09 UTC, 0 replies.
- [jira] [Closed] (NUTCH-941) Search returns blank page, when there is more than one SOLR server configured - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/01 16:37:09 UTC, 0 replies.
- [jira] [Closed] (NUTCH-423) Add other index-basic fields as query plugins - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/01 16:37:09 UTC, 0 replies.
- [jira] [Closed] (NUTCH-265) Getting Clustered results in better form. - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/01 16:37:09 UTC, 0 replies.
- [jira] [Closed] (NUTCH-542) Null Pointer Exception on getSummary when segment no longer exists - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/01 16:37:09 UTC, 0 replies.
- [jira] [Closed] (NUTCH-764) Add support for vfsfile:// loading of plugins for JBoss - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/01 16:37:10 UTC, 0 replies.
- [jira] [Closed] (NUTCH-299) Bittorrent Parser - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/01 16:41:06 UTC, 0 replies.
- [jira] [Closed] (NUTCH-343) Index MP3 SHA1 hashes - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/01 16:41:06 UTC, 0 replies.
- [jira] [Closed] (NUTCH-396) mergesegs sorts URLs, making segments useless for subsequent fetch - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/01 16:41:06 UTC, 0 replies.
- [jira] [Closed] (NUTCH-389) a url tokenizer implementation for tokenizing index fields : url and host - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/01 16:41:06 UTC, 0 replies.
- [jira] [Closed] (NUTCH-352) Add jar command to bin/nutch to allow launching hadoop job jars - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/01 16:41:06 UTC, 0 replies.
- [jira] [Closed] (NUTCH-326) WordExtractor throws java.util.NoSuchElementException on some documents - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/01 16:41:06 UTC, 0 replies.
- [jira] [Closed] (NUTCH-300) Clustering API improvements - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/01 16:41:06 UTC, 0 replies.
- [jira] [Closed] (NUTCH-316) Confusion about query languages - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/01 16:41:06 UTC, 0 replies.
- [jira] [Closed] (NUTCH-290) parse-pdf: Garbage indexed when text-extraction not allowed - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/01 16:41:06 UTC, 0 replies.
- [jira] [Closed] (NUTCH-358) Language Switching PROBLEM FIXED - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/01 16:41:06 UTC, 0 replies.
- [jira] [Closed] (NUTCH-26) New Http Authentication mechanism - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/01 16:57:06 UTC, 0 replies.
- [jira] [Closed] (NUTCH-259) Problem in IndexSorter after dedup - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/01 16:57:06 UTC, 0 replies.
- [jira] [Closed] (NUTCH-129) rtf-parser does not work when opened with wordpad files and saved - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/01 16:57:06 UTC, 0 replies.
- [jira] [Closed] (NUTCH-251) Administration GUI - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/01 16:57:06 UTC, 0 replies.
- [jira] [Closed] (NUTCH-283) If the Fetcher times out and abandons Fetcher Threads, severe errors will occur on those Threads - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/01 16:57:06 UTC, 0 replies.
- [jira] [Closed] (NUTCH-48) "Did you mean" query enhancement/refignment feature request - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/01 16:57:06 UTC, 0 replies.
- [jira] [Closed] (NUTCH-162) country code "jp" is used instead of language code "ja" for Japanese - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/01 16:57:06 UTC, 0 replies.
- [jira] [Closed] (NUTCH-281) cached.jsp: base-href needs to be outside comments - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/01 16:57:06 UTC, 0 replies.
- [jira] [Closed] (NUTCH-164) Locale (language) choice by first session has global effect to all sessions - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/01 16:57:06 UTC, 0 replies.
- [jira] [Closed] (NUTCH-267) Indexer doesn't consider linkdb when calculating boost value - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/01 16:57:07 UTC, 0 replies.
- [jira] [Closed] (NUTCH-272) Max. pages to crawl/fetch per site (emergency limit) - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/01 16:57:07 UTC, 0 replies.
- [jira] [Closed] (NUTCH-98) RobotRulesParser interprets robots.txt incorrectly - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/01 16:57:07 UTC, 0 replies.
- [jira] [Closed] (NUTCH-158) Process Sitemap data in text, rss or xml format as well as OAI-PMH - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/01 16:57:07 UTC, 0 replies.
- [jira] [Closed] (NUTCH-472) NullPointerException in ZipTextExtractor if no MIME type for zipped file - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/01 16:59:06 UTC, 0 replies.
- [jira] [Closed] (NUTCH-568) Indexer does not update the Lucene "TITLE" field - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/01 16:59:06 UTC, 0 replies.
- [jira] [Closed] (NUTCH-224) Nutch doesn't handle Korean text at all - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/01 16:59:06 UTC, 0 replies.
- [jira] [Closed] (NUTCH-441) Thai Analyzer Plugin - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/01 16:59:06 UTC, 0 replies.
- [jira] [Closed] (NUTCH-424) NekoHTML's DOMFragmentParser hangs on certain URLs (CLONE: Problem persists with Nutch 0.9 and 0.8.1 (Nekohtml 0.9.4)) - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/01 17:05:06 UTC, 0 replies.
- [jira] [Closed] (NUTCH-289) CrawlDatum should store IP address - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/01 17:05:06 UTC, 0 replies.
- [jira] [Closed] (NUTCH-709) JSParseFilter gets into an infinate loop and ets all the stack - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/01 17:05:06 UTC, 0 replies.
- [jira] [Closed] (NUTCH-87) Efficient site-specific crawling for a large number of sites - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/01 17:05:06 UTC, 0 replies.
- [jira] [Closed] (NUTCH-113) Disable permanent DNS-to-IP caching for JVM 1.4 - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/01 17:05:06 UTC, 0 replies.
- [jira] [Closed] (NUTCH-100) New plugin urlfilter-db - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/01 17:05:06 UTC, 0 replies.
- [jira] [Closed] (NUTCH-249) black- white list url filtering - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/01 17:05:06 UTC, 0 replies.
- [jira] [Closed] (NUTCH-182) Log when db.max configuration limits reached - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/01 17:05:06 UTC, 0 replies.
- [jira] [Closed] (NUTCH-119) Regexp to extract outlinks incorrect - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/01 17:05:06 UTC, 0 replies.
- [jira] [Closed] (NUTCH-496) ConcurrentModificationException can be thrown when getSorted() is called. - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/01 17:05:06 UTC, 0 replies.
- [jira] [Closed] (NUTCH-414) parse-mp3 plugin concatenating previous tags for text field - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/01 17:05:06 UTC, 0 replies.
- [jira] [Closed] (NUTCH-460) RDF parser plugin - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/01 17:05:07 UTC, 0 replies.
- [jira] [Closed] (NUTCH-742) Checksum Error - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/01 17:09:06 UTC, 0 replies.
- [jira] [Closed] (NUTCH-570) Improvement of URL Ordering in Generator.java - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/01 17:09:06 UTC, 0 replies.
- [jira] [Closed] (NUTCH-50) Benchmarks & Performance goals - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/01 17:09:06 UTC, 0 replies.
- [jira] [Closed] (NUTCH-473) ExcelExtractor performance bad due to String concatenation - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/01 17:09:07 UTC, 0 replies.
- [jira] [Closed] (NUTCH-854) Define standard attributes with values and explaination to configuration files in conf directory - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/01 17:09:07 UTC, 0 replies.
- [jira] [Closed] (NUTCH-86) LanguageIdentifier API enhancements - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/01 17:09:07 UTC, 0 replies.
- [jira] [Closed] (NUTCH-958) Httpclient scheme priority order fix - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/01 17:09:07 UTC, 0 replies.
- [jira] [Closed] (NUTCH-826) Mailing list is broken. - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/01 17:09:07 UTC, 0 replies.
- [jira] [Closed] (NUTCH-650) Hbase Integration - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/01 17:09:07 UTC, 0 replies.
- [jira] [Closed] (NUTCH-44) too many search results - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/01 17:09:07 UTC, 0 replies.
- [jira] [Closed] (NUTCH-659) Help! No urls fetched for internal repository website - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/01 17:09:08 UTC, 0 replies.
- [jira] [Closed] (NUTCH-866) STOP Nutch without breaking the crawled data - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/01 17:09:08 UTC, 0 replies.
- [jira] [Closed] (NUTCH-774) Retry interval in crawl date is set to 0 - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/01 17:09:08 UTC, 0 replies.
- [jira] [Closed] (NUTCH-310) Review Log Levels - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/01 17:09:08 UTC, 0 replies.
- [jira] [Closed] (NUTCH-591) StringIndexOutOfBoundsException when extracting text from a Word document. - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/01 17:09:08 UTC, 0 replies.
- [jira] [Closed] (NUTCH-185) XMLParser is configurable xml parser plugin. - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/01 17:09:08 UTC, 0 replies.
- [jira] [Closed] (NUTCH-677) Segment merge filering based on segment content - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/01 17:09:08 UTC, 0 replies.
- [jira] [Closed] (NUTCH-644) RTF parser doesn't compile anymore - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/01 17:09:08 UTC, 0 replies.
- [jira] [Closed] (NUTCH-860) package task fails - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/01 17:09:08 UTC, 0 replies.
- [jira] [Closed] (NUTCH-363) Fetcher normalizes everything at least twice - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/01 17:09:08 UTC, 0 replies.
- [jira] [Closed] (NUTCH-716) Make subcollection index filed multivalued - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/01 17:09:08 UTC, 0 replies.
- [jira] [Closed] (NUTCH-73) A page for CSV results - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/01 17:09:09 UTC, 0 replies.
- [jira] [Closed] (NUTCH-186) mapred-default.xml is over ridden by nutch-site.xml - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/01 17:09:09 UTC, 0 replies.
- [jira] [Closed] (NUTCH-714) Need a SFTP and SCP Protocol Handler - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/01 17:09:09 UTC, 0 replies.
- [jira] [Closed] (NUTCH-152) TaskRunner io pipes are not setDaemon(true), cleanup and exception errors are incomplete, max heap too small - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/01 17:09:09 UTC, 0 replies.
- [jira] [Closed] (NUTCH-564) External parser supports encoding attribute - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/01 17:09:09 UTC, 0 replies.
- [jira] [Closed] (NUTCH-101) RobotRulesParser - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/01 17:09:09 UTC, 0 replies.
- [jira] [Closed] (NUTCH-759) Removal of deprecated APIs - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/01 17:09:09 UTC, 0 replies.
- [jira] [Closed] (NUTCH-751) Upgrade version of HttpClient - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/01 17:09:09 UTC, 0 replies.
- [jira] [Closed] (NUTCH-309) Uses commons logging Code Guards - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/01 17:09:09 UTC, 0 replies.
- [jira] [Closed] (NUTCH-364) Javascript parser creates some fairly bogus URLs - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/01 17:27:05 UTC, 0 replies.
- [jira] [Closed] (NUTCH-689) Swf parser doesn't seem to handle relative links - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/01 17:27:05 UTC, 0 replies.
- [jira] [Closed] (NUTCH-461) microformats-reltag plugin and relative links - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/01 17:27:06 UTC, 0 replies.
- [jira] [Closed] (NUTCH-521) Modified injector to allow newly injected CrawlDatum to overwrite original - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/01 17:27:06 UTC, 0 replies.
- [jira] [Closed] (NUTCH-523) web2 searchform problems with patch - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/01 17:27:06 UTC, 0 replies.
- [jira] [Closed] (NUTCH-519)   prased incorrectly - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/01 17:27:06 UTC, 0 replies.
- [jira] [Closed] (NUTCH-458) Proxy forwarding to nutch.war does not work. Need to add some code... - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/01 17:27:06 UTC, 0 replies.
- [jira] [Closed] (NUTCH-595) "Target file:/.... already exists" - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/01 17:27:06 UTC, 0 replies.
- Build failed in Jenkins: Nutch-trunk #1444 - posted by Apache Hudson Server <hu...@hudson.apache.org> on 2011/04/02 06:01:33 UTC, 0 replies.
- Build failed in Jenkins: Nutch-trunk #1445 - posted by Apache Hudson Server <hu...@hudson.apache.org> on 2011/04/03 06:01:36 UTC, 0 replies.
- subscribe - posted by Shadiq Ammar <am...@gmail.com> on 2011/04/03 18:55:57 UTC, 0 replies.
- Build failed in Jenkins: Nutch-trunk #1446 - posted by Apache Hudson Server <hu...@hudson.apache.org> on 2011/04/04 06:01:28 UTC, 0 replies.
- Build failed in Jenkins: Nutch-trunk #1447 - posted by Apache Hudson Server <hu...@hudson.apache.org> on 2011/04/05 06:01:07 UTC, 0 replies.
- [jira] [Created] (NUTCH-974) Parsing Error in Nutch 1.2 on Windows7 - posted by "Niksa Jakovljevic (JIRA)" <ji...@apache.org> on 2011/04/05 10:22:06 UTC, 0 replies.
- [jira] [Commented] (NUTCH-974) Parsing Error in Nutch 1.2 on Windows7 - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/05 13:19:05 UTC, 2 replies.
- [jira] [Assigned] (NUTCH-974) Parsing Error in Nutch 1.2 on Windows7 - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/05 13:23:05 UTC, 0 replies.
- [jira] [Created] (NUTCH-975) Fix missing/wrong headers in source files - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/05 13:33:05 UTC, 0 replies.
- [jira] [Updated] (NUTCH-975) Fix missing/wrong headers in source files - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/05 14:01:06 UTC, 4 replies.
- [jira] [Created] (NUTCH-976) SolrIndex constants in wrong namespace (or prefix) - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/05 16:56:05 UTC, 0 replies.
- [jira] [Created] (NUTCH-977) SolrMappingReader uses hardcoded configuration parameter name for mapping file - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/05 16:58:06 UTC, 0 replies.
- [jira] [Updated] (NUTCH-976) SolrIndex constants in wrong namespace (or prefix) - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/05 17:02:06 UTC, 3 replies.
- [jira] [Commented] (NUTCH-975) Fix missing/wrong headers in source files - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/05 17:24:05 UTC, 3 replies.
- [jira] [Commented] (NUTCH-897) Subcollection requires blacklist element - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/05 17:34:05 UTC, 4 replies.
- [jira] [Updated] (NUTCH-977) SolrMappingReader uses hardcoded configuration parameter name for mapping file - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/05 17:55:05 UTC, 2 replies.
- Build failed in Jenkins: Nutch-trunk #1448 - posted by Apache Hudson Server <hu...@hudson.apache.org> on 2011/04/06 06:01:03 UTC, 0 replies.
- [jira] [Commented] (NUTCH-977) SolrMappingReader uses hardcoded configuration parameter name for mapping file - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/06 11:55:05 UTC, 2 replies.
- [jira] [Commented] (NUTCH-976) SolrIndex constants in wrong namespace (or prefix) - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/06 11:55:05 UTC, 5 replies.
- [jira] [Created] (NUTCH-978) [GSoC 2011] A Plugin for extracting certain element of a web page on html page parsing. - posted by "Ammar Shadiq (JIRA)" <ji...@apache.org> on 2011/04/06 17:11:05 UTC, 0 replies.
- [jira] [Updated] (NUTCH-978) [GSoC 2011] A Plugin for extracting certain element of a web page on html page parsing. - posted by "Ammar Shadiq (JIRA)" <ji...@apache.org> on 2011/04/06 17:28:05 UTC, 7 replies.
- [jira] [Commented] (NUTCH-967) Upgrade to Tika 0.9 - posted by "Gabriele Kahlout (JIRA)" <ji...@apache.org> on 2011/04/06 19:58:05 UTC, 4 replies.
- Build failed in Jenkins: Nutch-trunk #1449 - posted by Apache Hudson Server <hu...@hudson.apache.org> on 2011/04/07 06:01:07 UTC, 0 replies.
- [jira] [Assigned] (NUTCH-978) [GSoC 2011] A Plugin for extracting certain element of a web page on html page parsing. - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/07 09:58:05 UTC, 0 replies.
- [jira] [Commented] (NUTCH-978) [GSoC 2011] A Plugin for extracting certain element of a web page on html page parsing. - posted by "Ammar Shadiq (JIRA)" <ji...@apache.org> on 2011/04/07 10:02:05 UTC, 4 replies.
- [jira] [Issue Comment Edited] (NUTCH-978) [GSoC 2011] A Plugin for extracting certain element of a web page on html page parsing. - posted by "Ammar Shadiq (JIRA)" <ji...@apache.org> on 2011/04/07 10:04:06 UTC, 3 replies.
- [jira] [Updated] (NUTCH-967) Upgrade to Tika 0.9 - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/07 13:38:05 UTC, 1 replies.
- [jira] [Issue Comment Edited] (NUTCH-967) Upgrade to Tika 0.9 - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/07 13:40:05 UTC, 1 replies.
- Build failed in Jenkins: Nutch-trunk #1450 - posted by Apache Hudson Server <hu...@hudson.apache.org> on 2011/04/08 06:01:10 UTC, 0 replies.
- [jira] [Resolved] (NUTCH-967) Upgrade to Tika 0.9 - posted by "Julien Nioche (JIRA)" <ji...@apache.org> on 2011/04/08 12:11:05 UTC, 0 replies.
- [jira] [Updated] (NUTCH-944) Increase the number of elements to look for URLs and add the ability to specify multiple attributes by elements - posted by "Julien Nioche (JIRA)" <ji...@apache.org> on 2011/04/08 12:52:06 UTC, 0 replies.
- [jira] [Resolved] (NUTCH-972) Mergedb doesn't merge with empty directory, as is the case with merge (for indexes) - posted by "Julien Nioche (JIRA)" <ji...@apache.org> on 2011/04/08 13:07:05 UTC, 0 replies.
- [jira] [Commented] (NUTCH-963) Add support for deleting Solr documents with STATUS_DB_GONE in CrawlDB (404 urls) - posted by "Julien Nioche (JIRA)" <ji...@apache.org> on 2011/04/08 13:21:05 UTC, 1 reply.
- [jira] [Resolved] (NUTCH-963) Add support for deleting Solr documents with STATUS_DB_GONE in CrawlDB (404 urls) - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/08 13:26:05 UTC, 0 replies.
- [jira] [Created] (NUTCH-979) Add support for deleting Solr documents with ProtocolStatusCodes.NOTFOUND - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/08 13:40:05 UTC, 0 replies.
- All solr* commands fail in 1.3 - posted by Markus Jelsma <ma...@openindex.io> on 2011/04/08 14:44:50 UTC, 2 replies.
- GORA dependency and build failures - posted by Otis Gospodnetic <og...@yahoo.com> on 2011/04/08 19:55:50 UTC, 1 reply.
- Build failed in Jenkins: Nutch-trunk #1451 - posted by Apache Hudson Server <hu...@hudson.apache.org> on 2011/04/09 06:01:14 UTC, 0 replies.
- Extension point for URL generator - posted by Faux Manuel - S0810239008 <S0...@students.fh-hagenberg.at> on 2011/04/09 21:19:59 UTC, 0 replies.
- Build failed in Jenkins: Nutch-trunk #1452 - posted by Apache Hudson Server <hu...@hudson.apache.org> on 2011/04/10 06:01:03 UTC, 0 replies.
- Build failed in Jenkins: Nutch-trunk #1453 - posted by Apache Hudson Server <hu...@hudson.apache.org> on 2011/04/11 06:01:03 UTC, 0 replies.
- Hi, I hava a question about nutch source code : ) - posted by 소성은 <su...@gmail.com> on 2011/04/11 13:28:40 UTC, 2 replies.
- Build failed in Jenkins: Nutch-trunk #1454 - posted by Apache Hudson Server <hu...@hudson.apache.org> on 2011/04/12 06:02:20 UTC, 0 replies.
- [jira] [Created] (NUTCH-980) Fix IllegalAccessError with slf4j used in Solrj. - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/12 14:38:05 UTC, 0 replies.
- [jira] [Resolved] (NUTCH-897) Subcollection requires blacklist element - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/12 14:52:05 UTC, 0 replies.
- [jira] [Updated] (NUTCH-980) Fix IllegalAccessError with slf4j used in Solrj. - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/12 15:26:05 UTC, 1 reply.
- [jira] [Commented] (NUTCH-980) Fix IllegalAccessError with slf4j used in Solrj. - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/12 15:36:05 UTC, 2 replies.
- Nutch' pom.xml - posted by Markus Jelsma <ma...@openindex.io> on 2011/04/12 16:11:08 UTC, 5 replies.
- [jira] [Created] (NUTCH-981) Add tests for solr* tasks - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/12 16:37:05 UTC, 0 replies.
- ActiveThreads=0 - posted by ah...@accenture.com on 2011/04/12 16:41:36 UTC, 1 reply.
- [jira] [Created] (NUTCH-982) Remove copying of ID and URL field in solrmapping - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/12 17:23:05 UTC, 0 replies.
- [jira] [Updated] (NUTCH-982) Remove copying of ID and URL field in solrmapping - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/12 17:37:05 UTC, 3 replies.
- Build failed in Jenkins: Nutch-trunk #1455 - posted by Apache Hudson Server <hu...@hudson.apache.org> on 2011/04/13 06:00:36 UTC, 0 replies.
- chinese token overlap bug in org.apache.nutch.summary.basic.BasicSummarizer.getSummary - posted by Bupo Jung <bu...@gmail.com> on 2011/04/13 13:13:00 UTC, 2 replies.
- [jira] [Commented] (NUTCH-982) Remove copying of ID and URL field in solrmapping - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/13 14:49:05 UTC, 0 replies.
- [jira] [Created] (NUTCH-983) Upgrade SolrJ - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/13 15:45:12 UTC, 0 replies.
- [jira] [Resolved] (NUTCH-982) Remove copying of ID and URL field in solrmapping - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/13 21:36:06 UTC, 0 replies.
- [jira] [Commented] (NUTCH-944) Increase the number of elements to look for URLs and add the ability to specify multiple attributes by elements - posted by "Ken Krugler (JIRA)" <ji...@apache.org> on 2011/04/13 23:36:05 UTC, 0 replies.
- Nutch 1.3 release - posted by Markus Jelsma <ma...@openindex.io> on 2011/04/14 00:13:12 UTC, 6 replies.
- [jira] [Closed] (NUTCH-673) Upgrade the Carrot2 plug-in to release 3.0 - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/14 00:50:05 UTC, 0 replies.
- [jira] [Closed] (NUTCH-478) Add function for stopping FetherThread gracefully - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/14 01:00:06 UTC, 0 replies.
- [jira] [Closed] (NUTCH-84) Fetcher for constrained crawls - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/14 01:10:06 UTC, 0 replies.
- [jira] [Closed] (NUTCH-311) Page with tens of thousands of links OOME'd. - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/14 01:14:05 UTC, 0 replies.
- [jira] [Commented] (NUTCH-648) debian style autocomplete - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/14 01:22:05 UTC, 0 replies.
- [jira] [Assigned] (NUTCH-657) Estonian N-gram profile has wrong name - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/14 01:26:06 UTC, 0 replies.
- [jira] [Closed] (NUTCH-942) Add user uid from drupal or other cms to the author field of Nutch - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/14 01:26:06 UTC, 0 replies.
- [jira] [Updated] (NUTCH-657) Estonian N-gram profile has wrong name - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/14 01:30:05 UTC, 0 replies.
- [jira] [Commented] (NUTCH-657) Estonian N-gram profile has wrong name - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/14 01:32:05 UTC, 0 replies.
- [jira] [Commented] (NUTCH-672) allow unit tests to be run from bin/nutch - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/14 01:35:05 UTC, 0 replies.
- [jira] [Closed] (NUTCH-297) sandbox svn folder - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/14 01:37:05 UTC, 0 replies.
- [jira] [Closed] (NUTCH-313) moreFrom property in search.properties cannot be translated into Japanese. Compound text issue. - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/14 01:37:05 UTC, 0 replies.
- [jira] [Closed] (NUTCH-435) Synonym-Editor that creates OWL for the ontology plugin - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/14 01:37:05 UTC, 0 replies.
- [jira] [Closed] (NUTCH-947) text.jsp does not compile on Apache Tomcat, and charset is not specified - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/14 01:37:06 UTC, 0 replies.
- [jira] [Closed] (NUTCH-576) Different Analyzers Support - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/14 01:37:06 UTC, 0 replies.
- [jira] [Closed] (NUTCH-376) Add methods to control runtime behaviour of NutchBean - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/14 01:37:06 UTC, 0 replies.
- [jira] [Closed] (NUTCH-959) use of "ROWS" destroys result-lists: first hit appears also als last hit on each "page" (search via search?query... -> xml ) - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/14 01:45:05 UTC, 0 replies.
- [jira] [Closed] (NUTCH-558) Need tool to retrieve domain statistics - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/14 01:47:05 UTC, 0 replies.
- [jira] [Closed] (NUTCH-852) parser not found for contentType=application/xhtml+xml - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/14 01:49:05 UTC, 0 replies.
- [jira] [Closed] (NUTCH-454) Review Debug Level Log Guards - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/14 01:49:05 UTC, 0 replies.
- [jira] [Closed] (NUTCH-778) Running Nutch On linux having whoami exception? - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/14 01:49:06 UTC, 0 replies.
- [jira] [Closed] (NUTCH-805) Unable to resolve the url-blah-blah, skipping - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/14 01:49:06 UTC, 0 replies.
- [jira] [Closed] (NUTCH-791) External links for published javadocs are partially broken - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/14 01:49:06 UTC, 0 replies.
- [jira] [Closed] (NUTCH-733) plain text view of cached files ignores HTML encoding - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/14 01:49:06 UTC, 0 replies.
- [jira] [Closed] (NUTCH-736) how long it takes nutch 1.0 to fetch - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/14 01:49:06 UTC, 0 replies.
- [jira] [Closed] (NUTCH-934) Upgrade to Tika 0.8 - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/14 01:49:06 UTC, 0 replies.
- [jira] [Closed] (NUTCH-692) AlreadyBeingCreatedException with Hadoop 0.19 - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/14 01:49:06 UTC, 0 replies.
- Build failed in Jenkins: Nutch-trunk #1456 - posted by Apache Hudson Server <hu...@hudson.apache.org> on 2011/04/14 06:00:36 UTC, 0 replies.
- [jira] [Resolved] (NUTCH-980) Fix IllegalAccessError with slf4j used in Solrj. - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/14 10:58:05 UTC, 0 replies.
- [jira] [Resolved] (NUTCH-975) Fix missing/wrong headers in source files - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/14 11:53:06 UTC, 0 replies.
- [jira] [Updated] (NUTCH-976) Rename properties solrindex.* to solr.* - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/14 11:57:05 UTC, 0 replies.
- [jira] [Resolved] (NUTCH-976) Rename properties solrindex.* to solr.* - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/14 12:03:05 UTC, 0 replies.
- [jira] [Resolved] (NUTCH-977) SolrMappingReader uses hardcoded configuration parameter name for mapping file - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/14 12:09:05 UTC, 0 replies.
- [jira] [Closed] (NUTCH-922) SolrWriter should log source fields that are not mapped - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/14 12:44:05 UTC, 0 replies.
- Precopy http.agent properties to nutch-site - posted by Markus Jelsma <ma...@openindex.io> on 2011/04/14 12:49:54 UTC, 5 replies.
- [jira] [Commented] (NUTCH-422) index-extra plugin creates additional fields in the index, based on configurable logic - posted by "Dietrich Schmidt (JIRA)" <ji...@apache.org> on 2011/04/14 20:52:06 UTC, 0 replies.
- Build failed in Jenkins: Nutch-trunk #1457 - posted by Apache Hudson Server <hu...@hudson.apache.org> on 2011/04/15 06:03:19 UTC, 0 replies.
- [jira] [Commented] (NUTCH-386) Plugin to index categories by url rules - posted by "Richard Hull (JIRA)" <ji...@apache.org> on 2011/04/16 01:02:05 UTC, 3 replies.
- Build failed in Jenkins: Nutch-trunk #1458 - posted by Apache Hudson Server <hu...@hudson.apache.org> on 2011/04/16 06:03:13 UTC, 0 replies.
- [Nutch Wiki] Trivial Update of "Gabriele Kahlout" by Gabriele Kahlout - posted by Apache Wiki <wi...@apache.org> on 2011/04/16 09:15:58 UTC, 0 replies.
- Page Gabriele Kahlout deleted from Nutch Wiki - posted by Apache Wiki <wi...@apache.org> on 2011/04/16 09:20:38 UTC, 0 replies.
- [Nutch Wiki] Update of "Gabriele" by Gabriele - posted by Apache Wiki <wi...@apache.org> on 2011/04/16 09:21:06 UTC, 0 replies.
- [jira] [Reopened] (NUTCH-386) Plugin to index categories by url rules - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/16 12:59:05 UTC, 0 replies.
- [jira] [Issue Comment Edited] (NUTCH-940) static field plugin - posted by "Marseld Dedgjonaj (JIRA)" <ji...@apache.org> on 2011/04/16 13:03:06 UTC, 0 replies.
- [jira] [Commented] (NUTCH-940) static field plugin - posted by "Marseld Dedgjonaj (JIRA)" <ji...@apache.org> on 2011/04/16 13:03:06 UTC, 0 replies.
- Build failed in Jenkins: Nutch-trunk #1459 - posted by Apache Hudson Server <hu...@hudson.apache.org> on 2011/04/17 06:03:18 UTC, 0 replies.
- [jira] [Commented] (NUTCH-924) Static field in solr mapping - posted by "David Stuart (JIRA)" <ji...@apache.org> on 2011/04/17 11:22:05 UTC, 0 replies.
- Build failed in Jenkins: Nutch-trunk #1460 - posted by Apache Hudson Server <hu...@hudson.apache.org> on 2011/04/18 06:03:55 UTC, 0 replies.
- [Nutch Wiki] Update of "Support" by lawrenceemily - posted by Apache Wiki <wi...@apache.org> on 2011/04/18 07:30:46 UTC, 0 replies.
- [Nutch Wiki] Trivial Update of "Support" by JulienNioche - posted by Apache Wiki <wi...@apache.org> on 2011/04/18 09:24:19 UTC, 0 replies.
- [jira] [Updated] (NUTCH-961) Expose Tika's boilerpipe support - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/18 15:13:05 UTC, 2 replies.
- [jira] [Updated] (NUTCH-984) Parse-tika throws some URL's away - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/18 18:37:05 UTC, 0 replies.
- [jira] [Created] (NUTCH-984) Parse-tika throws some URL's away - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/18 18:37:05 UTC, 0 replies.
- [jira] [Commented] (NUTCH-984) Parse-tika throws some URL's away - posted by "Julien Nioche (JIRA)" <ji...@apache.org> on 2011/04/18 18:52:05 UTC, 1 reply.
- [jira] [Commented] (NUTCH-961) Expose Tika's boilerpipe support - posted by "Gabriele Kahlout (JIRA)" <ji...@apache.org> on 2011/04/18 19:06:06 UTC, 3 replies.
- Build failed in Jenkins: Nutch-trunk #1461 - posted by Apache Hudson Server <hu...@hudson.apache.org> on 2011/04/19 06:02:29 UTC, 0 replies.
- How could solrdedup work at all? - posted by Markus Jelsma <ma...@openindex.io> on 2011/04/19 10:43:06 UTC, 0 replies.
- [jira] [Created] (NUTCH-985) Problems indexing lastModifiedDate in Solr - posted by "Dietrich Schmidt (JIRA)" <ji...@apache.org> on 2011/04/19 20:28:05 UTC, 0 replies.
- [jira] [Updated] (NUTCH-985) Problems indexing lastModifiedDate in Solr - posted by "Dietrich Schmidt (JIRA)" <ji...@apache.org> on 2011/04/19 20:35:06 UTC, 3 replies.
- [jira] [Commented] (NUTCH-985) Problems indexing lastModifiedDate in Solr - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/19 20:41:05 UTC, 3 replies.
- Build failed in Jenkins: Nutch-trunk #1462 - posted by Apache Hudson Server <hu...@hudson.apache.org> on 2011/04/20 06:03:41 UTC, 0 replies.
- Build failed in Jenkins: Nutch-trunk #1463 - posted by Apache Jenkins Server <hu...@hudson.apache.org> on 2011/04/21 06:02:38 UTC, 0 replies.
- Build failed in Jenkins: Nutch-trunk #1464 - posted by Apache Jenkins Server <hu...@hudson.apache.org> on 2011/04/22 06:02:51 UTC, 0 replies.
- [jira] [Resolved] (NUTCH-984) Parse-tika throws some URL's away - posted by "Chris A. Mattmann (JIRA)" <ji...@apache.org> on 2011/04/22 08:09:06 UTC, 0 replies.
- [DISCUSS] Nutch 1.3 RC - posted by "Mattmann, Chris A (388J)" <ch...@jpl.nasa.gov> on 2011/04/22 08:10:06 UTC, 0 replies.
- [VOTE] Apache Nutch 1.3 Release Candidate #1 - posted by "Mattmann, Chris A (388J)" <ch...@jpl.nasa.gov> on 2011/04/22 08:48:10 UTC, 7 replies.
- Build failed in Jenkins: Nutch-trunk #1465 - posted by Apache Jenkins Server <hu...@hudson.apache.org> on 2011/04/23 06:04:10 UTC, 0 replies.
- Build failed in Jenkins: Nutch-trunk #1466 - posted by Apache Jenkins Server <hu...@hudson.apache.org> on 2011/04/24 06:03:55 UTC, 0 replies.
- [Nutch Wiki] Update of "FrontPage" by sunlightcs - posted by Apache Wiki <wi...@apache.org> on 2011/04/24 09:37:05 UTC, 0 replies.
- [Nutch Wiki] Trivial Update of "FrontPage" by JulienNioche - posted by Apache Wiki <wi...@apache.org> on 2011/04/24 10:28:58 UTC, 0 replies.
- [jira] [Resolved] (NUTCH-974) Parsing Error in Nutch 1.2 on Windows7 - posted by "Julien Nioche (JIRA)" <ji...@apache.org> on 2011/04/24 10:33:06 UTC, 0 replies.
- Build failed in Jenkins: Nutch-trunk #1467 - posted by Apache Jenkins Server <hu...@hudson.apache.org> on 2011/04/25 06:01:22 UTC, 0 replies.
- Build failed in Jenkins: Nutch-trunk #1468 - posted by Apache Jenkins Server <hu...@hudson.apache.org> on 2011/04/26 06:02:46 UTC, 0 replies.
- [jira] [Created] (NUTCH-986) Dedup fails due to date format (long) - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/26 11:52:03 UTC, 0 replies.
- [jira] [Created] (NUTCH-987) Support HTTP auth for Solr communication - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/26 17:51:03 UTC, 0 replies.
- [jira] [Issue Comment Edited] (NUTCH-984) Parse-tika throws some URL's away - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/26 18:03:03 UTC, 0 replies.
- [jira] [Updated] (NUTCH-983) Upgrade SolrJ - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/26 19:25:03 UTC, 0 replies.
- [jira] [Updated] (NUTCH-986) Dedup fails due to date format (long) - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/26 19:25:03 UTC, 4 replies.
- Build failed in Jenkins: Nutch-trunk #1469 - posted by Apache Jenkins Server <hu...@hudson.apache.org> on 2011/04/27 06:04:20 UTC, 0 replies.
- [jira] [Created] (NUTCH-988) index-feed plugin also doesn't use proper date fields - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/27 13:00:03 UTC, 0 replies.
- [jira] [Updated] (NUTCH-987) Support HTTP auth for Solr communication - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/27 13:06:03 UTC, 0 replies.
- RE: Nutch Crawl aborted with out any Request - posted by ah...@accenture.com on 2011/04/27 13:13:36 UTC, 0 replies.
- [jira] [Created] (NUTCH-989) index-basic plugin also uses invalid date format for Solr - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/27 13:14:03 UTC, 0 replies.
- [jira] [Updated] (NUTCH-985) MoreIndexingFilter doesn't use properly formatted date fields for Solr - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/27 13:31:08 UTC, 1 reply.
- [jira] [Updated] (NUTCH-989) index-basic plugin doesn - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/27 13:41:03 UTC, 0 replies.
- [jira] [Updated] (NUTCH-989) index-basic plugin doesn't use Solr date fieldType - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/27 13:43:03 UTC, 0 replies.
- SolrDedup doesn't commit - posted by Markus Jelsma <ma...@openindex.io> on 2011/04/27 16:16:01 UTC, 2 replies.
- [jira] [Commented] (NUTCH-983) Upgrade SolrJ - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/27 16:36:03 UTC, 4 replies.
- [jira] [Created] (NUTCH-990) protocol-httpclient fails with actually plain/text pages - posted by "Gabriele Kahlout (JIRA)" <ji...@apache.org> on 2011/04/27 16:36:03 UTC, 0 replies.
- [jira] [Commented] (NUTCH-990) protocol-httpclient fails with actually plain/text pages - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/27 16:52:03 UTC, 1 reply.
- [jira] [Updated] (NUTCH-990) protocol-httpclient fails with actually plain/text pages - posted by "Gabriele Kahlout (JIRA)" <ji...@apache.org> on 2011/04/27 17:00:04 UTC, 0 replies.
- [jira] [Commented] (NUTCH-985) MoreIndexingFilter doesn't use properly formatted date fields for Solr - posted by "Julien Nioche (JIRA)" <ji...@apache.org> on 2011/04/27 17:02:03 UTC, 1 reply.
- [jira] [Created] (NUTCH-991) SolrDedup must issue a commit - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/27 17:21:03 UTC, 0 replies.
- [jira] [Assigned] (NUTCH-991) SolrDedup must issue a commit - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/27 17:21:03 UTC, 0 replies.
- [jira] [Issue Comment Edited] (NUTCH-983) Upgrade SolrJ - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/27 17:29:03 UTC, 0 replies.
- [jira] [Updated] (NUTCH-979) Add support for deleting Solr documents with ProtocolStatusCodes.NOTFOUND - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/27 17:46:04 UTC, 0 replies.
- [jira] [Commented] (NUTCH-986) Dedup fails due to date format (long) - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/27 18:12:03 UTC, 1 reply.
- [jira] [Updated] (NUTCH-990) protocol-httpclient fails with short pages - posted by "Gabriele Kahlout (JIRA)" <ji...@apache.org> on 2011/04/27 20:46:03 UTC, 1 reply.
- [jira] [Issue Comment Edited] (NUTCH-990) protocol-httpclient fails with short pages - posted by "Gabriele Kahlout (JIRA)" <ji...@apache.org> on 2011/04/27 20:48:03 UTC, 1 reply.
- [jira] [Commented] (NUTCH-990) protocol-httpclient fails with short pages - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/27 21:00:03 UTC, 6 replies.
- Build failed in Jenkins: Nutch-trunk #1470 - posted by Apache Jenkins Server <hu...@hudson.apache.org> on 2011/04/28 06:03:46 UTC, 0 replies.
- [jira] [Resolved] (NUTCH-986) Dedup fails due to date format (long) - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/28 12:00:03 UTC, 0 replies.
- [jira] [Updated] (NUTCH-991) SolrDedup must issue a commit - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/28 13:17:03 UTC, 1 reply.
- [jira] [Resolved] (NUTCH-991) SolrDedup must issue a commit - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/28 13:19:03 UTC, 0 replies.
- [jira] [Created] (NUTCH-992) SolrDedup is broken in trunk - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/28 13:25:03 UTC, 0 replies.
- Update schema to get solrdedup working again - posted by Markus Jelsma <ma...@openindex.io> on 2011/04/28 13:43:54 UTC, 0 replies.
- Build failed in Jenkins: Nutch-trunk #1471 - posted by Apache Jenkins Server <hu...@hudson.apache.org> on 2011/04/29 06:03:47 UTC, 0 replies.
- Build failed in Jenkins: Nutch-trunk #1472 - posted by Apache Jenkins Server <hu...@hudson.apache.org> on 2011/04/30 06:03:54 UTC, 0 replies.
- 1.3 RC2? - posted by "Mattmann, Chris A (388J)" <ch...@jpl.nasa.gov> on 2011/04/30 06:20:17 UTC, 2 replies.
- [jira] [Resolved] (NUTCH-990) protocol-httpclient fails with short pages - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/30 12:50:03 UTC, 0 replies.