Posted to dev@nutch.apache.org by "Julien Nioche (JIRA)" <ji...@apache.org> on 2011/04/27 17:02:03 UTC
[jira] [Commented] (NUTCH-985) MoreIndexingFilter doesn't use properly formatted date fields for Solr
[ https://issues.apache.org/jira/browse/NUTCH-985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13025827#comment-13025827 ]
Julien Nioche commented on NUTCH-985:
-------------------------------------
Solr is our current indexer, but this might not be the case forever: we could have other backends, such as ElasticSearch, which would expect a different format. For this reason I'd rather we store Date objects in the various IndexingFilter implementations and do the Solr-specific formatting in the Solr indexer.
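
To illustrate the separation suggested here, a minimal sketch follows. It assumes the indexing filter hands over a plain java.util.Date and that only the Solr-facing code turns it into the ISO 8601 UTC string that Solr date fields expect. The class and method names (SolrDateSketch, toSolrDate) are hypothetical and are not part of Nutch or of the attached patches; the epoch value is the one from the issue description below.

```java
import java.time.format.DateTimeFormatter;
import java.util.Date;

// Hypothetical helper, not Nutch code: shows backend-specific date
// formatting kept out of the indexing filters.
final class SolrDateSketch {

    // Formats a Date as an ISO 8601 instant in UTC, the form Solr date
    // fields accept. Assumption: this would live in the Solr indexer.
    static String toSolrDate(Date date) {
        return DateTimeFormatter.ISO_INSTANT.format(date.toInstant());
    }

    public static void main(String[] args) {
        // The filter side keeps a backend-neutral Date object
        // (epoch milliseconds taken from the issue description).
        Date lastModified = new Date(1079326800000L);

        // Only the Solr side decides how the value is serialized.
        System.out.println(toSolrDate(lastModified)); // prints 2004-03-15T05:00:00Z
    }
}
```

Another backend could then apply its own formatting to the same Date object without any change to the filters.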
> MoreIndexingFilter doesn't use properly formatted date fields for Solr
> ----------------------------------------------------------------------
>
> Key: NUTCH-985
> URL: https://issues.apache.org/jira/browse/NUTCH-985
> Project: Nutch
> Issue Type: Bug
> Components: indexer
> Affects Versions: 1.3, 2.0
> Reporter: Dietrich Schmidt
> Assignee: Markus Jelsma
> Fix For: 1.3, 2.0
>
> Attachments: NUTCH-985-trunk-1.patch, NUTCH-985.1.3-1.patch, indexlastmodifieddate.jar
>
>
> I am using the index-more plugin to parse the lastModified data in web
> pages in order to store it in a Solr data field.
> In solrindex-mapping.xml I am mapping lastModified to a field "changed" in Solr:
> <field dest="changed" source="lastModified"/>
> However, when posting data to Solr the SolrIndexer posts it as a long,
> not as a date:
> <add><doc boost="1.0">
> <field name="changed">1079326800000</field>
> <field name="tstamp">20110414144140188</field>
> <field name="date">20040315</field>
> Solr rejects the data because of the improper data type.
--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira