Posted to solr-user@lucene.apache.org by KRIS MUSSHORN <mu...@comcast.net> on 2016/11/17 18:55:15 UTC

field set up help

I have a field in Solr 5.4.1 that has values like:
2016-10-15 
2016-09-10 
2015-10-12 
2010-09-02 
  
Yes, it is a date being stored as text.

I am getting the data into Solr via Nutch and the metatag plugin.

The data is coming directly from the website I am crawling, and I am not able to change the data at the source to something more palatable.

The field is set in Solr to be of type TextField that is indexed, tokenized, stored, and multivalued, with norms omitted.
  
Both the index and query analysis chains contain just the whitespace tokenizer factory and the lowercase filter factory. 
  
I need to be able to query for 2016-10 and only match 2016-10-15. 
  
Any ideas on how to set this up? 
  
TIA 
  
Kris  
  

Re: field set up help

Posted by Shawn Heisey <ap...@elyograg.org>.
On 11/17/2016 11:55 AM, KRIS MUSSHORN wrote:
> I have a field in solr 5.4.1 that has values like: 
> 2016-10-15 
> 2016-09-10 
> 2015-10-12 
> 2010-09-02 
>   
> Yes it is a date being stored as text. 
<snip>
> I need to be able to query for 2016-10 and only match 2016-10-15. 

I think your best bet is to use DateRangeField instead of TextField. 
You will need to reindex after changing your schema.

https://lucidworks.com/blog/2016/02/13/solrs-daterangefield-perform/
https://cwiki.apache.org/confluence/display/solr/Working+with+Dates#WorkingwithDates-DateRangeFormatting

Thanks,
Shawn
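
As a rough illustration of why a range-style date field fits this use case: the Solr reference guide describes a truncated DateRangeField value such as 2016-10 as covering the whole month. The sketch below is plain Python rather than Solr code, and the helper names month_bounds and in_month are made up for the example. It only mimics that interpretation against the sample values from the question.

```python
from datetime import date
from typing import Tuple


def month_bounds(year_month: str) -> Tuple[date, date]:
    """Return (first day of the month, first day of the next month) for 'YYYY-MM'."""
    year, month = (int(part) for part in year_month.split("-"))
    start = date(year, month, 1)
    # Roll over to January of the following year after December.
    end = date(year + 1, 1, 1) if month == 12 else date(year, month + 1, 1)
    return start, end


def in_month(value: str, year_month: str) -> bool:
    """True if the 'YYYY-MM-DD' value falls inside the given month."""
    start, end = month_bounds(year_month)
    return start <= date.fromisoformat(value) < end


# Sample values from the original question.
values = ["2016-10-15", "2016-09-10", "2015-10-12", "2010-09-02"]
matches = [v for v in values if in_month(v, "2016-10")]
print(matches)
```

In Solr itself the equivalent would be a query against the field once its type is DateRangeField and the data has been reindexed. Treat the exact query syntax as something to confirm against the documentation for your version.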


Re: field set up help

Posted by Will Martin <wm...@outlook.com>.
Don't give up yet, Kris.

q={!prefix f=metatag.date}2016-10&debugQuery

g'luck

will

On 11/17/2016 5:56 PM, Kris Musshorn wrote:

This q={!prefix f=metatag.date}2016-10 returns zero records


Re: field set up help

Posted by Erick Erickson <er...@gmail.com>.
I think that's because you have this as an analyzed field rather than a string field.

Best,
Erick
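
For what it's worth, here is a rough simulation of the analysis chain described in the original post (whitespace tokenizer plus lowercase filter). It is plain Python, not Solr's analyzers, and it assumes whitespace tokenization behaves like str.split() and lowercasing like str.lower(). Under those assumptions each date value stays a single token, whereas a hypothetical chain that also splits on punctuation would break it into pieces that a 2016-10 prefix could not match.

```python
import re


def whitespace_lowercase(text):
    """Mimic a whitespace tokenizer followed by a lowercase filter."""
    return [token.lower() for token in text.split()]


def punctuation_splitting(text):
    """Hypothetical chain that splits on any non-alphanumeric character."""
    return [token.lower() for token in re.split(r"[^0-9A-Za-z]+", text) if token]


def any_token_has_prefix(tokens, prefix):
    """True if at least one indexed token starts with the prefix."""
    return any(token.startswith(prefix) for token in tokens)


value = "2016-10-15"
print(whitespace_lowercase(value))   # the whole date as one token
print(punctuation_splitting(value))  # separate year, month, and day tokens

print(any_token_has_prefix(whitespace_lowercase(value), "2016-10"))
print(any_token_has_prefix(punctuation_splitting(value), "2016-10"))
```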

On Thu, Nov 17, 2016 at 2:56 PM, Kris Musshorn <mu...@comcast.net> wrote:
> This q={!prefix f=metatag.date}2016-10 returns zero records

Re: field set up help

Posted by Comcast <mu...@comcast.net>.
Perfect. Just had to wrap the PHP curl request URL with urlencode and it worked.

Sent from my iPhone
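
The fix described above was done with PHP's urlencode. A comparable sketch in Python, using the standard library's urllib.parse.urlencode, is shown here. The host, port, and core name (localhost:8983 and mycore) are placeholders, not details from this thread. The point is only that the braces, exclamation mark, space, and equals sign inside the local-params query need percent-encoding before they go into a URL.

```python
from urllib.parse import parse_qs, urlencode

# The prefix query from this thread, exactly as typed.
params = {
    "q": "{!prefix f=metatag.date}2016-10",
    "debugQuery": "true",
}

# urlencode percent-encodes reserved characters and turns spaces into '+'.
query_string = urlencode(params)

# Placeholder base URL; substitute your own host and core name.
base_url = "http://localhost:8983/solr/mycore/select"
url = base_url + "?" + query_string
print(url)

# Decoding the query string gives back the original query text.
decoded = parse_qs(query_string)
print(decoded["q"][0])
```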

> On Nov 17, 2016, at 5:56 PM, Kris Musshorn <mu...@comcast.net> wrote:
> 
> This q={!prefix f=metatag.date}2016-10 returns zero records


RE: field set up help

Posted by Kris Musshorn <mu...@comcast.net>.
This q={!prefix f=metatag.date}2016-10 returns zero records


Re: field set up help

Posted by KRIS MUSSHORN <mu...@comcast.net>.
So if the field was named metatag.date, it would be q={!prefix f=metatag.date}2016-10....


Re: field set up help

Posted by Erik Hatcher <er...@gmail.com>.
Given what you’ve said, my hunch is you could make the query like this:

    q={!prefix f=field_name}2016-10

tada!  ?!

There’s nothing wrong with indexing dates as text like that, as long as the queries you need can be run efficiently. For the query type you mentioned, the text/string-ish indexing you’ve done is quite well suited to prefix queries that grab dates by year, year-month, or year-month-day. If you need to get more sophisticated with date queries (DateRangeField is my new favorite), you can leverage ParseDateFieldUpdateProcessorFactory without having to change the incoming format.

	Erik
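
To make the prefix idea concrete, here is a small sketch in plain Python. It does not talk to Solr; it just applies str.startswith to the sample values from the question, which is roughly the matching behavior a prefix query has against single-token values. The function name prefix_matches is made up for the example.

```python
# Sample values from the original question, each treated as one indexed term.
values = ["2016-10-15", "2016-09-10", "2015-10-12", "2010-09-02"]


def prefix_matches(terms, prefix):
    """Return the terms that start with the given prefix, in input order."""
    return [term for term in terms if term.startswith(prefix)]


by_year = prefix_matches(values, "2016")        # everything in 2016
by_month = prefix_matches(values, "2016-10")    # only October 2016
by_day = prefix_matches(values, "2016-10-15")   # one exact day

print(by_year)
print(by_month)
print(by_day)
```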



