Posted to dev@nutch.apache.org by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/27 13:00:03 UTC
[jira] [Created] (NUTCH-988) index-feed plugin also doesn't use proper date fields
index-feed plugin also doesn't use proper date fields
-----------------------------------------------------
Key: NUTCH-988
URL: https://issues.apache.org/jira/browse/NUTCH-988
Project: Nutch
Issue Type: Improvement
Affects Versions: 1.3, 2.0
Reporter: Markus Jelsma
Priority: Minor
Fix For: 2.0
Like some other fields, the date fields generated by the feed-plugin are not using the proper date format for Solr.
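For illustration only: Solr date fields expect ISO 8601 timestamps in UTC, such as 1995-12-31T23:59:59Z. The sketch below shows one way to produce that format from a millisecond timestamp. The class name SolrDateSketch and the method toSolrDate are hypothetical and are not taken from the Nutch code base or from the actual fix.

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

// Hypothetical helper, not part of Nutch: formats a timestamp the way
// Solr date fields expect it (ISO 8601, UTC, trailing 'Z').
public class SolrDateSketch {

    // Pattern with a literal 'Z'; the formatter is pinned to UTC so the
    // literal is always truthful.
    private static final DateTimeFormatter SOLR_DATE =
        DateTimeFormatter.ofPattern("yyyy-MM-dd'T'HH:mm:ss'Z'")
                         .withZone(ZoneOffset.UTC);

    // Converts milliseconds since the Unix epoch to a Solr-style date string.
    public static String toSolrDate(long epochMillis) {
        return SOLR_DATE.format(Instant.ofEpochMilli(epochMillis));
    }

    public static void main(String[] args) {
        // The Unix epoch itself, as a simple sanity check.
        System.out.println(toSolrDate(0L)); // prints 1970-01-01T00:00:00Z
    }
}
```

A feed's published or updated timestamp could be passed through such a helper before being added to the index document, assuming it is available as epoch milliseconds.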
--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
RE: Nutch Crawl aborted with out any Request
Posted by ah...@accenture.com.
Hello list,
I have a question, please.
I tried to crawl some URLs, but after 45 minutes the crawl aborted before completing, without any request from me to stop it.
Here is the last section of the log output file:
INFO plugin.PluginRepository - Plugin Auto-activation mode: [true]
2011-04-26 23:55:45,330 INFO plugin.PluginRepository - Registered Plugins:
2011-04-26 23:55:45,330 INFO plugin.PluginRepository - the nutch core extension points (nutch-extensionpoints)
2011-04-26 23:55:45,330 INFO plugin.PluginRepository - Basic Query Filter (query-basic)
2011-04-26 23:55:45,330 INFO plugin.PluginRepository - Basic URL Normalizer (urlnormalizer-basic)
2011-04-26 23:55:45,330 INFO plugin.PluginRepository - Html Parse Plug-in (parse-html)
2011-04-26 23:55:45,330 INFO plugin.PluginRepository - Basic Indexing Filter (index-basic)
2011-04-26 23:55:45,330 INFO plugin.PluginRepository - Pdf Parse Plug-in (parse-pdf)
2011-04-26 23:55:45,330 INFO plugin.PluginRepository - Site Query Filter (query-site)
2011-04-26 23:55:45,330 INFO plugin.PluginRepository - Http / Https Protocol Plug-in (protocol-httpclient)
2011-04-26 23:55:45,330 INFO plugin.PluginRepository - HTTP Framework (lib-http)
2011-04-26 23:55:45,330 INFO plugin.PluginRepository - Regex URL Filter (urlfilter-regex)
2011-04-26 23:55:45,330 INFO plugin.PluginRepository - Pass-through URL Normalizer (urlnormalizer-pass)
2011-04-26 23:55:45,330 INFO plugin.PluginRepository - Regex URL Normalizer (urlnormalizer-regex)
2011-04-26 23:55:45,330 INFO plugin.PluginRepository - OPIC Scoring Plug-in (scoring-opic)
2011-04-26 23:55:45,330 INFO plugin.PluginRepository - Tika Parser Plug-in (parse-tika)
2011-04-26 23:55:45,330 INFO plugin.PluginRepository - CyberNeko HTML Parser (lib-nekohtml)
2011-04-26 23:55:45,330 INFO plugin.PluginRepository - Anchor Indexing Filter (index-anchor)
2011-04-26 23:55:45,330 INFO plugin.PluginRepository - SWF Parse Plug-in (parse-swf)
2011-04-26 23:55:45,330 INFO plugin.PluginRepository - URL Query Filter (query-url)
2011-04-26 23:55:45,330 INFO plugin.PluginRepository - Regex URL Filter Framework (lib-regex-filter)
2011-04-26 23:55:45,330 INFO plugin.PluginRepository - Registered Extension-Points:
2011-04-26 23:55:45,330 INFO plugin.PluginRepository - Nutch Summarizer (org.apache.nutch.searcher.Summarizer)
2011-04-26 23:55:45,330 INFO plugin.PluginRepository - Nutch Protocol (org.apache.nutch.protocol.Protocol)
2011-04-26 23:55:45,330 INFO plugin.PluginRepository - Nutch Segment Merge Filter (org.apache.nutch.segment.SegmentMergeFilter)
2011-04-26 23:55:45,330 INFO plugin.PluginRepository - Nutch Analysis (org.apache.nutch.analysis.NutchAnalyzer)
2011-04-26 23:55:45,330 INFO plugin.PluginRepository - Nutch Field Filter (org.apache.nutch.indexer.field.FieldFilter)
2011-04-26 23:55:45,331 INFO plugin.PluginRepository - HTML Parse Filter (org.apache.nutch.parse.HtmlParseFilter)
2011-04-26 23:55:45,331 INFO plugin.PluginRepository - Nutch Query Filter (org.apache.nutch.searcher.QueryFilter)
2011-04-26 23:55:45,331 INFO plugin.PluginRepository - Nutch Search Results Response Writer (org.apache.nutch.searcher.response.ResponseWriter)
2011-04-26 23:55:45,331 INFO plugin.PluginRepository - Nutch URL Normalizer (org.apache.nutch.net.URLNormalizer)
2011-04-26 23:55:45,331 INFO plugin.PluginRepository - Nutch URL Filter (org.apache.nutch.net.URLFilter)
2011-04-26 23:55:45,331 INFO plugin.PluginRepository - Nutch Online Search Results Clustering Plugin (org.apache.nutch.clustering.OnlineClusterer)
2011-04-26 23:55:45,331 INFO plugin.PluginRepository - Nutch Indexing Filter (org.apache.nutch.indexer.IndexingFilter)
2011-04-26 23:55:45,331 INFO plugin.PluginRepository - Nutch Content Parser (org.apache.nutch.parse.Parser)
2011-04-26 23:55:45,331 INFO plugin.PluginRepository - Nutch Scoring (org.apache.nutch.scoring.ScoringFilter)
2011-04-26 23:55:45,331 INFO plugin.PluginRepository - Ontology Model Loader (org.apache.nutch.ontology.Ontology)
2011-04-26 23:55:45,334 INFO indexer.IndexingFilters - Adding org.apache.nutch.indexer.basic.BasicIndexingFilter
2011-04-26 23:55:45,347 INFO indexer.IndexingFilters - Adding org.apache.nutch.indexer.anchor.AnchorIndexingFilter
Thanks for any help.
Amed
[jira] [Closed] (NUTCH-988) index-feed plugin also doesn't use proper date fields
Posted by "Julien Nioche (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/NUTCH-988?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Julien Nioche closed NUTCH-988.
-------------------------------
Resolution: Fixed
Fixed as part of https://issues.apache.org/jira/browse/NUTCH-999
> index-feed plugin also doesn't use proper date fields
> -----------------------------------------------------
>
> Key: NUTCH-988
> URL: https://issues.apache.org/jira/browse/NUTCH-988
> Project: Nutch
> Issue Type: Improvement
> Affects Versions: 1.3, 2.0
> Reporter: Markus Jelsma
> Priority: Minor
> Fix For: 2.0
>
>
> Like some other fields, the date fields generated by the feed-plugin are not using the proper date format for Solr.