Posted to solr-dev@lucene.apache.org by "Shalin Shekhar Mangar (JIRA)" <ji...@apache.org> on 2008/08/29 23:09:44 UTC

[jira] Created: (SOLR-741) Add support for rounding dates in DateField

Add support for rounding dates in DateField
-------------------------------------------

                 Key: SOLR-741
                 URL: https://issues.apache.org/jira/browse/SOLR-741
             Project: Solr
          Issue Type: Improvement
          Components: search
    Affects Versions: 1.4
            Reporter: Shalin Shekhar Mangar
            Priority: Minor
             Fix For: 1.4


As discussed at http://www.nabble.com/Rounding-date-fields-td19203108.html

Since rounding dates to a coarser value is an often-recommended way to decrease the number of unique terms, we should add support for doing this in DateField itself. Several syntaxes were proposed, including:
# <fieldType name="date" class="solr.DateField" sortMissingLast="true" omitNorms="true" roundTo="-1MINUTE" /> (Shalin)
# <fieldType name="date" class="solr.DateField" sortMissingLast="true" omitNorms="true" round="DOWN_MINUTE" /> (Otis)
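
To illustrate why rounding reduces the number of unique terms, here is a minimal sketch in plain Java (java.time only, no Solr classes). The class and method names are hypothetical and only stand in for whatever DateField would do internally; the sample timestamps are made up for illustration.

```java
import java.time.Instant;
import java.time.temporal.ChronoUnit;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper, not part of Solr: counts how many distinct values
// remain once every timestamp is truncated to the given unit.
class UniqueTermsDemo {
    static int distinctAfterRounding(List<Instant> values, ChronoUnit unit) {
        Set<Instant> distinct = new TreeSet<>();
        for (Instant v : values) {
            distinct.add(v.truncatedTo(unit)); // round down to the unit boundary
        }
        return distinct.size();
    }

    public static void main(String[] args) {
        // Made-up sample data: four timestamps within about 20 seconds.
        List<Instant> values = List.of(
            Instant.parse("2008-08-29T23:09:44.123Z"),
            Instant.parse("2008-08-29T23:09:44.987Z"),
            Instant.parse("2008-08-29T23:09:51.000Z"),
            Instant.parse("2008-08-29T23:10:02.500Z"));
        System.out.println("millis:  " + distinctAfterRounding(values, ChronoUnit.MILLIS));
        System.out.println("seconds: " + distinctAfterRounding(values, ChronoUnit.SECONDS));
        System.out.println("minutes: " + distinctAfterRounding(values, ChronoUnit.MINUTES));
    }
}
```

The coarser the unit, the more raw values collapse onto the same term, which is the effect the proposals above are after.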

Hoss proposed more general enhancements related to arbitrary pre-processing of values prior to indexing/storing using pre-processing analyzers.

This issue aims to build a consensus on the solution to pursue and to implement that solution inside Solr.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (SOLR-741) Add support for rounding dates in DateField

Posted by "Hoss Man (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12627176#action_12627176 ] 

Hoss Man commented on SOLR-741:
-------------------------------

I propose a lot of "more general" things -- but I'm also a fan of simple, direct, specific enhancements to solve common problems.  I'm on board with adding support for something like this directly to DateField.

Reusing the DateMathParser syntax makes a lot of sense -- it has a lot of flexibility and should already be familiar to people doing non-trivial things with DateField.  Calling it "round" or "roundTo" seems like it would pigeonhole it a bit ... perhaps "forceMath" or "appendMath" or "mutate" or something that better conveys the idea of "general modification made to all dates"

The downsides: 
# it has no simple syntax for "round up" but it can be expressed somewhat verbosely ("+1DAY-1MILLI/DAY" rounds up to the nearest day) 
# it has no notion of "round to the nearest 5 minutes" which some people might expect
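
As a rough illustration of the first point, the sketch below mimics "/DAY" (round down) and the verbose "+1DAY-1MILLI/DAY" (round up) using java.time rather than DateMathParser itself. It assumes the operations are applied left to right and that values carry millisecond precision; the class and method names are hypothetical.

```java
import java.time.Instant;
import java.time.temporal.ChronoUnit;

// Hypothetical helper, not Solr code: models two date math expressions
// with java.time so the arithmetic can be checked in isolation.
class DateMathRounding {
    // Models "/UNIT": drop everything below the unit.
    static Instant roundDown(Instant t, ChronoUnit unit) {
        return t.truncatedTo(unit);
    }

    // Models "+1UNIT-1MILLI/UNIT", applied left to right.
    // Assumes millisecond-precision input; a value already on a unit
    // boundary is left where it is.
    static Instant roundUp(Instant t, ChronoUnit unit) {
        return t.plus(1, unit).minusMillis(1).truncatedTo(unit);
    }

    public static void main(String[] args) {
        Instant t = Instant.parse("2008-08-29T23:09:44Z");
        System.out.println(roundDown(t, ChronoUnit.DAYS));
        System.out.println(roundUp(t, ChronoUnit.DAYS));
    }
}
```

Subtracting one millisecond before truncating is what keeps a date that is already exactly at midnight from being pushed to the following day.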

...but honestly, those could easily be added as new features to DateMathParser  -- and then they'd benefit this issue as well as general Date Math usages in queries (like date faceting)

syntax wise: perhaps "\FOO" could be the round up equivalent of "/FOO" ? ... with "/nFOO" and "\nFOO" being the "round down/up to the nearest nth value for unit FOO" ?
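
A minimal sketch of what "/nFOO" and "\nFOO" might compute, written with java.time rather than as a DateMathParser change. Both syntaxes are only proposals in this thread; the names below are hypothetical, and the sketch assumes intervals are aligned to the epoch in UTC (so "5 minutes" means :00, :05, :10, ...).

```java
import java.time.Duration;
import java.time.Instant;

// Hypothetical helper illustrating the proposed "round to the nearest
// nth value" behaviour; not an existing Solr or DateMathParser feature.
class IntervalRounding {
    // Sketch of "/nFOO": largest interval boundary <= t.
    static Instant roundDownTo(Instant t, Duration interval) {
        long step = interval.toMillis();
        long ms = t.toEpochMilli();
        return Instant.ofEpochMilli(Math.floorDiv(ms, step) * step);
    }

    // Sketch of "\nFOO": smallest interval boundary >= t.
    static Instant roundUpTo(Instant t, Duration interval) {
        long step = interval.toMillis();
        long ms = t.toEpochMilli();
        // ceil(ms / step) expressed via floorDiv so negative epochs also work
        return Instant.ofEpochMilli(-Math.floorDiv(-ms, step) * step);
    }

    public static void main(String[] args) {
        Instant t = Instant.parse("2008-08-29T23:09:44Z");
        System.out.println(roundDownTo(t, Duration.ofMinutes(5)));
        System.out.println(roundUpTo(t, Duration.ofMinutes(5)));
    }
}
```

Epoch alignment is the simplest choice for fixed-length units such as minutes or hours; calendar units like months would need a different approach.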



[jira] Updated: (SOLR-741) Add support for rounding dates in DateField

Posted by "Shalin Shekhar Mangar (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-741?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Shalin Shekhar Mangar updated SOLR-741:
---------------------------------------

    Fix Version/s:     (was: 1.4)
                   1.5

Deferring to 1.5 -- with trie support coming in, this has less significance now.



[jira] Commented: (SOLR-741) Add support for rounding dates in DateField

Posted by "Noble Paul (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12646464#action_12646464 ] 

Noble Paul commented on SOLR-741:
---------------------------------

Most users would just need a precision-like option, and it is intuitive as to how it behaves. Maybe roundTo can be another option (for advanced users).



[jira] Commented: (SOLR-741) Add support for rounding dates in DateField

Posted by "Hoss Man (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12835008#action_12835008 ] 

Hoss Man commented on SOLR-741:
-------------------------------

bq. With the introduction of Trie fields is it not irrelevant now? can we close it 

TrieFields make it more efficient to do range searches on numeric fields indexed at full precision, but they don't actually do anything to round the fields for people who genuinely want their stored and indexed values to only have second/minute/hour/day precision regardless of what the initial raw data looks like.

So while TrieFields definitely make this less of a priority from a performance standpoint, they don't solve the full problem.

(Unless I'm missing something, actually rounding the values prior to indexing will still help improve performance in general because it will reduce the total number of Terms ... with TrieFields, isn't the original value always indexed regardless of the precisionStep?)
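
The question about precisionStep can be pictured with a conceptual sketch. It assumes a trie encoding that emits one term per shift, starting at shift 0 and stepping by precisionStep across the 64 bits of a long; this is a simplified model with hypothetical names, not Lucene's actual term encoding.

```java
import java.util.ArrayList;
import java.util.List;

// Conceptual model only: lists the bit shifts at which a 64-bit value
// would be indexed under the assumption described above.
class TriePrefixSketch {
    static List<Integer> indexedShifts(int precisionStep) {
        if (precisionStep < 1) {
            throw new IllegalArgumentException("precisionStep must be >= 1");
        }
        List<Integer> shifts = new ArrayList<>();
        for (int shift = 0; shift < 64; shift += precisionStep) {
            shifts.add(shift); // shift 0 is the untouched, full-precision value
        }
        return shifts;
    }

    public static void main(String[] args) {
        System.out.println(indexedShifts(8));
        System.out.println(indexedShifts(16));
    }
}
```

Under that assumption the shift-0 term is present for every precisionStep, so the number of distinct full-precision terms only shrinks if the values themselves are rounded before indexing.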



[jira] Commented: (SOLR-741) Add support for rounding dates in DateField

Posted by "Noble Paul (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12646225#action_12646225 ] 

Noble Paul commented on SOLR-741:
---------------------------------

An average user may not be very familiar with Date Math syntax. Does this require something like DateMath?
How about the following:
{code}
 <!-- precision can have values 
year|month|day|hour|minute|second|millis
 -->
<fieldType name="date" class="solr.DateField" precision="minute" /> 
{code}
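
One way the proposed precision values could map onto rounding is sketched below with java.time. The precision attribute is only a proposal here, so the class, the method, and the choice to round down in UTC are all assumptions made for illustration.

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.ZonedDateTime;
import java.time.temporal.ChronoUnit;
import java.util.Locale;

// Hypothetical helper, not Solr code: maps the proposed precision names
// to a round-down operation, evaluated in UTC.
class PrecisionRounding {
    static Instant applyPrecision(Instant t, String precision) {
        ZonedDateTime utc = t.atZone(ZoneOffset.UTC);
        switch (precision.toLowerCase(Locale.ROOT)) {
            case "year":
                // start of the day, then move to January 1st
                return utc.truncatedTo(ChronoUnit.DAYS).withDayOfYear(1).toInstant();
            case "month":
                // start of the day, then move to the 1st of the month
                return utc.truncatedTo(ChronoUnit.DAYS).withDayOfMonth(1).toInstant();
            case "day":
                return t.truncatedTo(ChronoUnit.DAYS);
            case "hour":
                return t.truncatedTo(ChronoUnit.HOURS);
            case "minute":
                return t.truncatedTo(ChronoUnit.MINUTES);
            case "second":
                return t.truncatedTo(ChronoUnit.SECONDS);
            case "millis":
                return t.truncatedTo(ChronoUnit.MILLIS);
            default:
                throw new IllegalArgumentException("Unknown precision: " + precision);
        }
    }

    public static void main(String[] args) {
        Instant t = Instant.parse("2008-08-29T23:09:44.123Z");
        System.out.println(applyPrecision(t, "minute"));
        System.out.println(applyPrecision(t, "month"));
    }
}
```

Year and month need calendar arithmetic because they are not fixed-length units, which is why they go through ZonedDateTime instead of a plain truncation.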



[jira] Commented: (SOLR-741) Add support for rounding dates in DateField

Posted by "Hoss Man (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12646407#action_12646407 ] 

Hoss Man commented on SOLR-741:
-------------------------------

bq. An average user may not very familiar w/ Date Math syntax .

but if they're doing enough stuff with dates that they're worried about the precision they probably ought to be.

that said: a more straightforward "precision" option would certainly be better than forcing the user to know the Date Math Parser syntax if all we are supporting is rounding down ... my previous suggestion was mainly along the lines of "if we want to support both rounding down or up, or support rounding to an interval (ie: 5 minutes) let's add those features to Date Math and reuse that syntax"





[jira] Commented: (SOLR-741) Add support for rounding dates in DateField

Posted by "Noble Paul (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12796124#action_12796124 ] 

Noble Paul commented on SOLR-741:
---------------------------------

With the introduction of Trie fields, is this not irrelevant now? Can we close it?



[jira] Commented: (SOLR-741) Add support for rounding dates in DateField

Posted by "Ryan McKinley (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12646122#action_12646122 ] 

Ryan McKinley commented on SOLR-741:
------------------------------------

I agree with adding the accuracy directly to the DateField type.
