Posted to users@solr.apache.org by WORK Gaétan QUENTIN <wo...@gmail.com> on 2021/05/15 09:42:23 UTC

Date string field not recognized...

Hi,


  I have a problem with a date format not being recognized in an .ods document:

Environment:

------------------


ubuntu 20.04, lxd container

openjdk version "14.0.2" 2020-07-14

solr 8.8.2


Problem:

----------

Indexing fails on a LibreOffice .ods file:

SimplePostTool: WARNING: Response: {
"responseHeader":{
"status":400,
"QTime":552},
"error":{
"metadata":[
"error-class","org.apache.solr.common.SolrException",
"root-error-class","org.apache.solr.common.SolrException"],
"msg":"ERROR: [doc=XXX.ods] Error adding field 'last_modified'='2019-01-08T00:22:05.772138594' msg=Invalid Date String:'2019-01-08T00:22:05.772138594'",
"code":400}}

But I don't understand why: the date format looks fine to me.

The documentation says that sub-second digits are truncated if there are too 
many, and the Z for UTC is not mandatory, is it?


And question 2: how can I tell Solr to turn an invalid date format into a 
valid one on the fly, or to recognize a new format?


Regards,

Gaétan


Re: Date string field not recognized...

Posted by Walter Underwood <wu...@wunderwood.org>.
You could write an update request processor script to add a ‘Z’ to the end of that field.
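
A minimal sketch of the transformation such a script would apply, written as standalone Python rather than as an actual Solr update processor. The function name `append_utc_marker` is hypothetical, and the code assumes the zone-less timestamps are already in UTC, which nothing in this thread confirms.

```python
import re

# Matches a zone-less ISO-8601 timestamp with optional fractional seconds,
# e.g. 2011-02-13T20:44:16 or 2019-01-08T00:22:05.772138594
ZONELESS = re.compile(r"^\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}(\.\d+)?$")


def append_utc_marker(value: str) -> str:
    """Append 'Z' to a zone-less timestamp; return anything else unchanged.

    ASSUMPTION: the timestamp is already expressed in UTC. If the source
    documents store local time, appending 'Z' shifts the instant.
    """
    stripped = value.strip()
    if ZONELESS.match(stripped):
        return stripped + "Z"
    return value


if __name__ == "__main__":
    for sample in ("2011-02-13T20:44:16", "2019-01-08T00:22:05.772138594"):
        print(sample, "->", append_utc_marker(sample))
```

Values that already end in 'Z', or that do not look like timestamps at all, pass through untouched, so the function can be applied to the field blindly.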

wunder
Walter Underwood
wunder@wunderwood.org
http://observer.wunderwood.org/  (my blog)

> On May 24, 2021, at 2:15 PM, Gaétan QUENTIN@Work <wo...@gmail.com> wrote:
> 
> I think it is because the 'Z' is missing: I thought it was optional, but it turns out to be mandatory.  So how can I tell Solr to accept dates without the 'Z'?


Re: Date string field not recognized...

Posted by Gaétan QU...@Work, wo...@gmail.com.
I think it is because the 'Z' is missing: I thought it was optional, but 
it turns out to be mandatory.  So how can I tell Solr to accept dates 
without the 'Z'?


On 24/05/2021 at 22:59, Gaétan QUENTIN@Work wrote:
> Thanks for your answer.
>
> It looks quite complicated to add a custom UpdateRequestProcessor 
> to a core based on sample_techproducts_configs.
>
> Are there tools to help with this?

Re: Date string field not recognized...

Posted by Gaétan QU...@Work, wo...@gmail.com.
Thanks for your answer.

It looks quite complicated to add a custom UpdateRequestProcessor 
to a core based on sample_techproducts_configs.

Are there tools to help with this?

Many dates are reported as invalid by Solr, but I don't see why. Most of 
the documents are .ods / .odt:

msg=Invalid Date String:'2011-02-13T20:44:16'
msg=Invalid Date String:'2015-11-25T06:55:49.316556151'
msg=Invalid Date String:'2020-01-15T18:31:12.132601079'
msg=Invalid Date String:'2019-12-09T01:16:41.920883407'
msg=Invalid Date String:'2006-06-05T01:54:39'
msg=Invalid Date String:'2006-04-15T08:55:56'
msg=Invalid Date String:'2013-07-21T23:08:21'
msg=Invalid Date String:'2020-09-23T14:11:34.397265987'
msg=Invalid Date String:'2021-03-05T08:00:52.763074287'
msg=Invalid Date String:'2019-01-08T00:22:05.772138594'
msg=Invalid Date String:'2012-07-10T13:11:32'
msg=Invalid Date String:'2010-05-29T17:48:35'
msg=Invalid Date String:'2016-05-16T11:59:55.489935279'
msg=Invalid Date String:'2007-02-07T00:36:22'
msg=Invalid Date String:'2007-02-06T20:43:42'
msg=Invalid Date String:'2019-05-09T03:25:30.090833759'
msg=Invalid Date String:'2013-03-02T01:03:42'
msg=Invalid Date String:'2013-02-28T23:03:47'
msg=Invalid Date String:'2013-03-01T00:27:39'
msg=Invalid Date String:'2013-03-02T16:04:47'
msg=Invalid Date String:'2013-04-18T13:48:12'
msg=Invalid Date String:'2009-05-18T12:46:23'
msg=Invalid Date String:'2010-12-03T01:13:07'


Regards,

Gaétan
On 15/05/2021 at 15:05, Alexandre Rafalovitch wrote:
> Not sure why the date is not recognized. But to parse alternative formats,
> you can create a custom UpdateRequestProcessor with a number of formats to
> accept.
>
> That's part of how "schema less" mode works, you can explore that by
> checking solrconfig.xml
>
> Regards,
>     Alex

Re: Date string field not recognized...

Posted by Alexandre Rafalovitch <ar...@gmail.com>.
Not sure why the date is not recognized. But to parse alternative formats,
you can create a custom UpdateRequestProcessor with a number of formats to
accept.

That's part of how "schemaless" mode works; you can explore that by
checking solrconfig.xml.
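
To illustrate the "number of formats to accept" idea outside Solr, here is a rough Python sketch that tries a list of layouts in order and emits a UTC timestamp ending in 'Z'. The format list and the names `ACCEPTED_FORMATS`, `parse_any` and `to_solr_date` are hypothetical, the input is assumed to be in UTC, and fractional seconds are dropped for simplicity. It is not the processor Solr ships.

```python
import re
from datetime import datetime, timezone
from typing import Optional

# Hypothetical list of accepted layouts, tried in order.
ACCEPTED_FORMATS = [
    "%Y-%m-%dT%H:%M:%S.%f",
    "%Y-%m-%dT%H:%M:%S",
    "%Y-%m-%d",
]


def parse_any(value: str) -> Optional[datetime]:
    """Return a UTC datetime for the first matching format, or None."""
    # strptime's %f accepts at most six fractional digits, so trim the rest.
    trimmed = re.sub(r"(\.\d{6})\d+$", r"\1", value.strip())
    for fmt in ACCEPTED_FORMATS:
        try:
            parsed = datetime.strptime(trimmed, fmt)
        except ValueError:
            continue
        # ASSUMPTION: zone-less input is treated as UTC.
        return parsed.replace(tzinfo=timezone.utc)
    return None


def to_solr_date(value: str) -> Optional[str]:
    """Format a recognized date as YYYY-MM-DDThh:mm:ssZ (no fraction)."""
    parsed = parse_any(value)
    if parsed is None:
        return None
    return parsed.strftime("%Y-%m-%dT%H:%M:%SZ")


if __name__ == "__main__":
    for sample in ("2019-01-08T00:22:05.772138594", "2010-12-03", "garbage"):
        print(sample, "->", to_solr_date(sample))
```

Returning None for unrecognized input leaves the caller free to skip the field or reject the document instead of guessing.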

Regards,
   Alex
