Posted to solr-user@lucene.apache.org by Rafal Bluszcz Zawadzki <ra...@headnet.dk> on 2010/07/26 14:34:03 UTC

Problem with parsing date

Hi,

I am using the Data Import Handler from Solr 1.4.

Parts of my data-config.xml are:


        <entity name="page"
                processor="XPathEntityProcessor"
                stream="false"
                forEach="/multistatus/response"
                url="/tmp/file.xml"
                transformer="RegexTransformer,DateFormatTransformer,TemplateTransformer"
                >
.....

            <field column="modified"
                   xpath="/multistatus/response/propstat/prop/getlastmodified"
                   dateTimeFormat="EEE, d MMM yyyy HH:mm:ss z" />
            <field column="CreationDate"
                   xpath="/multistatus/response/propstat/prop/creationdate"
                   dateTimeFormat="yyyy-MM-dd'T'hh:mm:ss'Z'"/>
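
A side note on the second pattern, offered as a hedged sketch rather than as a diagnosis of the error reported below: dateTimeFormat takes a java.text.SimpleDateFormat pattern, where "hh" is the 12-hour clock and "HH" is the 24-hour clock, and the quoted 'Z' is matched as a literal character, so no time zone is read from the input. The class name HourPatternDemo and the sample timestamp are made up for illustration, and the time zone is pinned to UTC only to make the result reproducible.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Locale;
import java.util.TimeZone;

// Hypothetical demo class, not part of Solr.
class HourPatternDemo {

    // Parses `text` with `pattern` in UTC and renders the result on a 24-hour clock.
    static String reparse(String pattern, String text) {
        TimeZone utc = TimeZone.getTimeZone("UTC");
        SimpleDateFormat in = new SimpleDateFormat(pattern, Locale.ENGLISH);
        in.setTimeZone(utc);
        SimpleDateFormat out = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss", Locale.ENGLISH);
        out.setTimeZone(utc);
        try {
            return out.format(in.parse(text));
        } catch (ParseException e) {
            throw new IllegalArgumentException("Cannot parse: " + text, e);
        }
    }

    public static void main(String[] args) {
        String sample = "2009-07-15T12:30:00Z"; // made-up value for illustration
        // 12-hour pattern, as in the config above
        System.out.println(reparse("yyyy-MM-dd'T'hh:mm:ss'Z'", sample));
        // 24-hour pattern
        System.out.println(reparse("yyyy-MM-dd'T'HH:mm:ss'Z'", sample));
    }
}
```

With the 12-hour pattern, an input hour of 12 is read as 12 AM, so the first line shows 00:30:00 while the second shows 12:30:00. Whether this matters depends on the hours that actually occur in the creationdate values.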

During full-import I got this message:

WARNING: Error creating document :
SolrInputDocument[{SearchableText=SearchableText(1.0)={phrase},
parentPaths=parentPaths(1.0)={/site},
review_state=review_state(1.0)={published},
created=created(1.0)={Sat Oct 11 14:38:27 CEST 2003},
UID=UID(1.0)={http://www.example.com:80/File-1563},
Title=Title(1.0)={This is only an example document},
portal_type=portal_type(1.0)={Document},
modified=modified(1.0)={Wed, 15 Jul 2009 08:23:34 GMT}}]
org.apache.solr.common.SolrException: Invalid Date String:'Wed, 15 Jul 2009 08:23:34 GMT'
at org.apache.solr.schema.DateField.parseMath(DateField.java:163)
at org.apache.solr.schema.TrieDateField.createField(TrieDateField.java:171)
at org.apache.solr.schema.SchemaField.createField(SchemaField.java:94)
at org.apache.solr.update.DocumentBuilder.toDocument(DocumentBuilder.java:246)
at org.apache.solr.update.processor.RunUpdateProcessor.processAdd(RunUpdateProcessorFactory.java:60)

Which, as I understand it, means that Solr / Java couldn't parse my date.

In my XML file it looks like this:
<getlastmodified>Wed, 15 Jul 2009 08:23:34 GMT</getlastmodified>

In my opinion the format "EEE, d MMM yyyy HH:mm:ss z" is correct and, more
importantly, it was supposed to work with the same data a week ago :)

Any ideas would be appreciated.
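
The stack trace above shows the raw string reaching the schema's date field, which accepts ISO 8601 UTC timestamps such as 2009-07-15T08:23:34Z. As a standalone check outside Solr, the sketch below parses the sample value with the pattern from the config and prints it in that form. The class name Rfc1123ToSolrDate is hypothetical, and Locale.ENGLISH is passed explicitly as an assumption so that the result does not depend on the machine's settings.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Locale;
import java.util.TimeZone;

// Hypothetical helper class, not part of Solr.
class Rfc1123ToSolrDate {
    // Pattern taken from the data-config.xml fragment above.
    static final String INPUT_PATTERN = "EEE, d MMM yyyy HH:mm:ss z";
    // ISO 8601 form with a literal trailing Z, rendered in UTC.
    static final String SOLR_PATTERN = "yyyy-MM-dd'T'HH:mm:ss'Z'";

    static String toSolrDate(String text) {
        // Assumption: English day and month names, independent of the JVM default locale.
        SimpleDateFormat in = new SimpleDateFormat(INPUT_PATTERN, Locale.ENGLISH);
        SimpleDateFormat out = new SimpleDateFormat(SOLR_PATTERN, Locale.ENGLISH);
        out.setTimeZone(TimeZone.getTimeZone("UTC"));
        try {
            return out.format(in.parse(text));
        } catch (ParseException e) {
            throw new IllegalArgumentException("Cannot parse: " + text, e);
        }
    }

    public static void main(String[] args) {
        // Value copied from the <getlastmodified> element above.
        System.out.println(toSolrDate("Wed, 15 Jul 2009 08:23:34 GMT"));
    }
}
```

If this runs cleanly on the same machine, the pattern itself is unlikely to be the problem, and the environment in which Solr runs becomes the more likely suspect.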

-- 
Rafal Zawadzki
Backend developer

Re: Problem with parsing date

Posted by Rafal Bluszcz Zawadzki <ra...@headnet.dk>.
I have just fixed it.

The problem was related to operating system values: they were different from
what Solr expected for the incoming data stream.
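
The message does not say which operating system value was involved. One setting that can produce exactly this symptom, offered here purely as an assumption, is the default locale: a SimpleDateFormat built without an explicit locale uses the JVM default, which is normally derived from the operating system, and English names such as "Wed" and "Jul" do not match under many other locales. The sketch below (the class name LocaleParseDemo is hypothetical) tries the same pattern under two locales.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Locale;

// Hypothetical demo class, not part of Solr.
class LocaleParseDemo {
    // Pattern taken from the original data-config.xml.
    static final String PATTERN = "EEE, d MMM yyyy HH:mm:ss z";

    // Returns true if `text` parses with PATTERN under the given locale.
    static boolean parses(String text, Locale locale) {
        SimpleDateFormat fmt = new SimpleDateFormat(PATTERN, locale);
        try {
            fmt.parse(text);
            return true;
        } catch (ParseException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String sample = "Wed, 15 Jul 2009 08:23:34 GMT";
        System.out.println("JVM default locale: " + Locale.getDefault());
        System.out.println("ENGLISH parses: " + parses(sample, Locale.ENGLISH));
        // GERMAN is an arbitrary non-English example, not the locale from the thread.
        System.out.println("GERMAN parses:  " + parses(sample, Locale.GERMAN));
    }
}
```

If the locale turns out to be the cause, the JVM default can be set at startup with the standard system properties -Duser.language=en -Duser.country=US.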

Regards,

Rafal Zawadzki

On Mon, Jul 26, 2010 at 3:20 PM, Chantal Ackermann <
chantal.ackermann@btelligent.de> wrote:

> On Mon, 2010-07-26 at 14:46 +0200, Rafal Bluszcz Zawadzki wrote:
> > EEE, d MMM yyyy HH:mm:ss z
>
> not sure but you might want to try with an uppercase 'Z' for the
> timezone (surrounded by single quotes, alternatively). The rest of your
> pattern looks fine. But if you still run into problems try different
> versions, like putting the comma in quotes etc.
>
> Cheers,
> Chantal
>
>
>

Re: Problem with parsing date

Posted by Chantal Ackermann <ch...@btelligent.de>.
On Mon, 2010-07-26 at 14:46 +0200, Rafal Bluszcz Zawadzki wrote:
> EEE, d MMM yyyy HH:mm:ss z

Not sure, but you might want to try an uppercase 'Z' for the
timezone (surrounded by single quotes, alternatively). The rest of your
pattern looks fine. But if you still run into problems, try different
variations, like putting the comma in quotes, etc.

Cheers,
Chantal



Re: Problem with parsing date

Posted by Rafal Bluszcz Zawadzki <ra...@headnet.dk>.
I am also using other dateFormat strings in the same data handler, and they
work. But not this one.

And this data is fetched from an external source, so I cannot modify it
(well, theoretically I can save it, edit it, etc., but that is not the way).
Why is this not working with Solr?



On Mon, Jul 26, 2010 at 2:37 PM, Li Li <fa...@gmail.com> wrote:

> I use a format like yyyy-MM-ddThh:mm:ssZ. It works.

Re: Problem with parsing date

Posted by Li Li <fa...@gmail.com>.
I use a format like yyyy-MM-ddThh:mm:ssZ. It works.
