You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@lucene.apache.org by "Michael McCandless (JIRA)" <ji...@apache.org> on 2009/06/22 23:07:07 UTC

[jira] Created: (LUCENE-1712) Set default precisionStep for NumericField and NumericRangeFilter

Set default precisionStep for NumericField and NumericRangeFilter
-----------------------------------------------------------------

                 Key: LUCENE-1712
                 URL: https://issues.apache.org/jira/browse/LUCENE-1712
             Project: Lucene - Java
          Issue Type: Improvement
    Affects Versions: 2.9
            Reporter: Michael McCandless
            Priority: Minor
             Fix For: 2.9


This is a spinoff from LUCENE-1701.

A user using Numeric* should not need to understand what's
"under the hood" in order to do their indexing & searching.

They should be able to simply:
{code}
doc.add(new NumericField("price", 15.50);
{code}

And have a decent default precisionStep selected for them.

Actually, if we add ctors to NumericField for each of the supported
types (so the above code works), we can set the default per-type.  I
think we should do that?

4 for int and 6 for long was proposed as good defaults.

The default need not be "perfect", as advanced users can always
optimize their precisionStep, and for users experiencing slow
RangeQuery performance, NumericRangeQuery with any of the defaults we
are discussing will be much faster.


-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org


[jira] Commented: (LUCENE-1712) Set default precisionStep for NumericField and NumericRangeFilter

Posted by "Hoss Man (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-1712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12722878#action_12722878 ] 

Hoss Man commented on LUCENE-1712:
----------------------------------

behind the scenes precision changes based on which set*Value() method is called smells really wrong.

I'm not overly familiar with NumericField, but i'm i'm understanding the current suggestion, wouldn't that mkae situations like this come up...

{code}
NumericField a = new NumericField("price", MY_CUSTOM_PRECISION_STEP, ...);
a.setFloatValue(23.4f); // blows away my custom precision

NumericField b = new NumericField("price", ...);
b.setPrecisionStep(MY_CUSTOM_PRECISION_STEP);
b.setFloatValue(23.4f); // blows away my custom precision

NumericField c = new NumericField("price", ...);
c.setFloatValue(23.4f); 
c.setPrecisionStep(MY_CUSTOM_PRECISION_STEP); // only way to get my value used
{code}

...that seems sketchy, and really anoying if people try reusing NumericField instances.

If the goal is to have good "defaults" based on type then why not just have a constant per type that people can refer to explicitly? if they don't know what number to pick ... as well as a true "default" if they pick nothing.

{code}
int DEFAULT_STEP = ...;
int SUGGESTED_INT_STEP = ...;
int SUGGESTED_FLOAT_STEP = ...;
{code}






> Set default precisionStep for NumericField and NumericRangeFilter
> -----------------------------------------------------------------
>
>                 Key: LUCENE-1712
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1712
>             Project: Lucene - Java
>          Issue Type: Improvement
>    Affects Versions: 2.9
>            Reporter: Michael McCandless
>            Priority: Minor
>             Fix For: 2.9
>
>
> This is a spinoff from LUCENE-1701.
> A user using Numeric* should not need to understand what's
> "under the hood" in order to do their indexing & searching.
> They should be able to simply:
> {code}
> doc.add(new NumericField("price", 15.50);
> {code}
> And have a decent default precisionStep selected for them.
> Actually, if we add ctors to NumericField for each of the supported
> types (so the above code works), we can set the default per-type.  I
> think we should do that?
> 4 for int and 6 for long was proposed as good defaults.
> The default need not be "perfect", as advanced users can always
> optimize their precisionStep, and for users experiencing slow
> RangeQuery performance, NumericRangeQuery with any of the defaults we
> are discussing will be much faster.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org


[jira] Commented: (LUCENE-1712) Set default precisionStep for NumericField and NumericRangeFilter

Posted by "Michael McCandless (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-1712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12722840#action_12722840 ] 

Michael McCandless commented on LUCENE-1712:
--------------------------------------------

OK, I agree it's dangerous to let javac auto cast.

So, can we set an across the board default of 4?

> Set default precisionStep for NumericField and NumericRangeFilter
> -----------------------------------------------------------------
>
>                 Key: LUCENE-1712
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1712
>             Project: Lucene - Java
>          Issue Type: Improvement
>    Affects Versions: 2.9
>            Reporter: Michael McCandless
>            Priority: Minor
>             Fix For: 2.9
>
>
> This is a spinoff from LUCENE-1701.
> A user using Numeric* should not need to understand what's
> "under the hood" in order to do their indexing & searching.
> They should be able to simply:
> {code}
> doc.add(new NumericField("price", 15.50);
> {code}
> And have a decent default precisionStep selected for them.
> Actually, if we add ctors to NumericField for each of the supported
> types (so the above code works), we can set the default per-type.  I
> think we should do that?
> 4 for int and 6 for long was proposed as good defaults.
> The default need not be "perfect", as advanced users can always
> optimize their precisionStep, and for users experiencing slow
> RangeQuery performance, NumericRangeQuery with any of the defaults we
> are discussing will be much faster.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org


[jira] Assigned: (LUCENE-1712) Set default precisionStep for NumericField and NumericRangeFilter

Posted by "Michael McCandless (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCENE-1712?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Michael McCandless reassigned LUCENE-1712:
------------------------------------------

    Assignee: Uwe Schindler

I'm assuming this one is yours Uwe!

> Set default precisionStep for NumericField and NumericRangeFilter
> -----------------------------------------------------------------
>
>                 Key: LUCENE-1712
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1712
>             Project: Lucene - Java
>          Issue Type: Improvement
>    Affects Versions: 2.9
>            Reporter: Michael McCandless
>            Assignee: Uwe Schindler
>            Priority: Minor
>             Fix For: 2.9
>
>
> This is a spinoff from LUCENE-1701.
> A user using Numeric* should not need to understand what's
> "under the hood" in order to do their indexing & searching.
> They should be able to simply:
> {code}
> doc.add(new NumericField("price", 15.50);
> {code}
> And have a decent default precisionStep selected for them.
> Actually, if we add ctors to NumericField for each of the supported
> types (so the above code works), we can set the default per-type.  I
> think we should do that?
> 4 for int and 6 for long was proposed as good defaults.
> The default need not be "perfect", as advanced users can always
> optimize their precisionStep, and for users experiencing slow
> RangeQuery performance, NumericRangeQuery with any of the defaults we
> are discussing will be much faster.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org


[jira] Commented: (LUCENE-1712) Set default precisionStep for NumericField and NumericRangeFilter

Posted by "Hoss Man (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-1712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12723324#action_12723324 ] 

Hoss Man commented on LUCENE-1712:
----------------------------------

{quote}
By the way: It is also a good idea to use valSize (32 or 64) as precisionStep in the case that you do not want to do range queries on the field (and use it only for sorting).
{quote}

This screams out for additional (redundant) constants that are self documenting in their names...

{code}
int PRECISION_STEP_DEFAULT = 4; // i think?
int PRECISION_STEP_SUGGESTED_INT_RANGEANDSORT = ...; // no idea what this should be
int PRECISION_STEP_SUGGESTED_INT_SORTONLY = 32; // i think?
int PRECISION_STEP_SUGGESTED_FLOAT_RANGEANDSORT = ...; // no idea what this should be
int PRECISION_STEP_SUGGESTED_FLOAT_SORTONLY = 32; // i think?
...
{code}

> Set default precisionStep for NumericField and NumericRangeFilter
> -----------------------------------------------------------------
>
>                 Key: LUCENE-1712
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1712
>             Project: Lucene - Java
>          Issue Type: Improvement
>    Affects Versions: 2.9
>            Reporter: Michael McCandless
>            Priority: Minor
>             Fix For: 2.9
>
>
> This is a spinoff from LUCENE-1701.
> A user using Numeric* should not need to understand what's
> "under the hood" in order to do their indexing & searching.
> They should be able to simply:
> {code}
> doc.add(new NumericField("price", 15.50);
> {code}
> And have a decent default precisionStep selected for them.
> Actually, if we add ctors to NumericField for each of the supported
> types (so the above code works), we can set the default per-type.  I
> think we should do that?
> 4 for int and 6 for long was proposed as good defaults.
> The default need not be "perfect", as advanced users can always
> optimize their precisionStep, and for users experiencing slow
> RangeQuery performance, NumericRangeQuery with any of the defaults we
> are discussing will be much faster.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org


[jira] Commented: (LUCENE-1712) Set default precisionStep for NumericField and NumericRangeFilter

Posted by "Earwin Burrfoot (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-1712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12722996#action_12722996 ] 

Earwin Burrfoot commented on LUCENE-1712:
-----------------------------------------

Having half of your methods constantly fail with an exception depending on constructor parameter. That just screams - "Split me into two classes!"

> Set default precisionStep for NumericField and NumericRangeFilter
> -----------------------------------------------------------------
>
>                 Key: LUCENE-1712
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1712
>             Project: Lucene - Java
>          Issue Type: Improvement
>    Affects Versions: 2.9
>            Reporter: Michael McCandless
>            Priority: Minor
>             Fix For: 2.9
>
>
> This is a spinoff from LUCENE-1701.
> A user using Numeric* should not need to understand what's
> "under the hood" in order to do their indexing & searching.
> They should be able to simply:
> {code}
> doc.add(new NumericField("price", 15.50);
> {code}
> And have a decent default precisionStep selected for them.
> Actually, if we add ctors to NumericField for each of the supported
> types (so the above code works), we can set the default per-type.  I
> think we should do that?
> 4 for int and 6 for long was proposed as good defaults.
> The default need not be "perfect", as advanced users can always
> optimize their precisionStep, and for users experiencing slow
> RangeQuery performance, NumericRangeQuery with any of the defaults we
> are discussing will be much faster.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org


[jira] Commented: (LUCENE-1712) Set default precisionStep for NumericField and NumericRangeFilter

Posted by "Earwin Burrfoot (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-1712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12722843#action_12722843 ] 

Earwin Burrfoot commented on LUCENE-1712:
-----------------------------------------

Am I misunderstanding something or the problem still persists?
Even if you use a common default, what is your base type - int or long? Are floats converted to ints, or to longs?

> Set default precisionStep for NumericField and NumericRangeFilter
> -----------------------------------------------------------------
>
>                 Key: LUCENE-1712
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1712
>             Project: Lucene - Java
>          Issue Type: Improvement
>    Affects Versions: 2.9
>            Reporter: Michael McCandless
>            Priority: Minor
>             Fix For: 2.9
>
>
> This is a spinoff from LUCENE-1701.
> A user using Numeric* should not need to understand what's
> "under the hood" in order to do their indexing & searching.
> They should be able to simply:
> {code}
> doc.add(new NumericField("price", 15.50);
> {code}
> And have a decent default precisionStep selected for them.
> Actually, if we add ctors to NumericField for each of the supported
> types (so the above code works), we can set the default per-type.  I
> think we should do that?
> 4 for int and 6 for long was proposed as good defaults.
> The default need not be "perfect", as advanced users can always
> optimize their precisionStep, and for users experiencing slow
> RangeQuery performance, NumericRangeQuery with any of the defaults we
> are discussing will be much faster.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org


[jira] Commented: (LUCENE-1712) Set default precisionStep for NumericField and NumericRangeFilter

Posted by "Uwe Schindler (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-1712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12723001#action_12723001 ] 

Uwe Schindler commented on LUCENE-1712:
---------------------------------------

...and unmaintainable. I merged the two classes from contrib because of this. It was just duplicate code with some small variances. Always a problem for copy/paste operations.

> Set default precisionStep for NumericField and NumericRangeFilter
> -----------------------------------------------------------------
>
>                 Key: LUCENE-1712
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1712
>             Project: Lucene - Java
>          Issue Type: Improvement
>    Affects Versions: 2.9
>            Reporter: Michael McCandless
>            Priority: Minor
>             Fix For: 2.9
>
>
> This is a spinoff from LUCENE-1701.
> A user using Numeric* should not need to understand what's
> "under the hood" in order to do their indexing & searching.
> They should be able to simply:
> {code}
> doc.add(new NumericField("price", 15.50);
> {code}
> And have a decent default precisionStep selected for them.
> Actually, if we add ctors to NumericField for each of the supported
> types (so the above code works), we can set the default per-type.  I
> think we should do that?
> 4 for int and 6 for long was proposed as good defaults.
> The default need not be "perfect", as advanced users can always
> optimize their precisionStep, and for users experiencing slow
> RangeQuery performance, NumericRangeQuery with any of the defaults we
> are discussing will be much faster.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org


[jira] Commented: (LUCENE-1712) Set default precisionStep for NumericField and NumericRangeFilter

Posted by "Uwe Schindler (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-1712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12723000#action_12723000 ] 

Uwe Schindler commented on LUCENE-1712:
---------------------------------------

Four classes! And with LUCENE-1710 there will be six! Not a good idea. 6 classes for NumericTokenStream, 6 for NumericField and maybe 6 for NumRangeQuery/Filter. brrrrr

> Set default precisionStep for NumericField and NumericRangeFilter
> -----------------------------------------------------------------
>
>                 Key: LUCENE-1712
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1712
>             Project: Lucene - Java
>          Issue Type: Improvement
>    Affects Versions: 2.9
>            Reporter: Michael McCandless
>            Priority: Minor
>             Fix For: 2.9
>
>
> This is a spinoff from LUCENE-1701.
> A user using Numeric* should not need to understand what's
> "under the hood" in order to do their indexing & searching.
> They should be able to simply:
> {code}
> doc.add(new NumericField("price", 15.50);
> {code}
> And have a decent default precisionStep selected for them.
> Actually, if we add ctors to NumericField for each of the supported
> types (so the above code works), we can set the default per-type.  I
> think we should do that?
> 4 for int and 6 for long was proposed as good defaults.
> The default need not be "perfect", as advanced users can always
> optimize their precisionStep, and for users experiencing slow
> RangeQuery performance, NumericRangeQuery with any of the defaults we
> are discussing will be much faster.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org


[jira] Commented: (LUCENE-1712) Set default precisionStep for NumericField and NumericRangeFilter

Posted by "Uwe Schindler (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-1712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12722849#action_12722849 ] 

Uwe Schindler commented on LUCENE-1712:
---------------------------------------

bq. Even if you use a common default, what is your base type - int or long? Are floats converted to ints, or to longs? 
Float are indexed like ints and doubles like longs.

The problem here is more that if you would specify the value direct in the constructor, you cannot for sure always give the right type (because Java auto-casts). This is why I have these setFloatValue(), setLongValue() and so on.

> Set default precisionStep for NumericField and NumericRangeFilter
> -----------------------------------------------------------------
>
>                 Key: LUCENE-1712
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1712
>             Project: Lucene - Java
>          Issue Type: Improvement
>    Affects Versions: 2.9
>            Reporter: Michael McCandless
>            Priority: Minor
>             Fix For: 2.9
>
>
> This is a spinoff from LUCENE-1701.
> A user using Numeric* should not need to understand what's
> "under the hood" in order to do their indexing & searching.
> They should be able to simply:
> {code}
> doc.add(new NumericField("price", 15.50);
> {code}
> And have a decent default precisionStep selected for them.
> Actually, if we add ctors to NumericField for each of the supported
> types (so the above code works), we can set the default per-type.  I
> think we should do that?
> 4 for int and 6 for long was proposed as good defaults.
> The default need not be "perfect", as advanced users can always
> optimize their precisionStep, and for users experiencing slow
> RangeQuery performance, NumericRangeQuery with any of the defaults we
> are discussing will be much faster.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org


[jira] Commented: (LUCENE-1712) Set default precisionStep for NumericField and NumericRangeFilter

Posted by "Earwin Burrfoot (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-1712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12722851#action_12722851 ] 

Earwin Burrfoot commented on LUCENE-1712:
-----------------------------------------

Aha! And each time you invoke setFloatValue/setDoubleValue it switches base type behind the scenes? Eeek.

> Set default precisionStep for NumericField and NumericRangeFilter
> -----------------------------------------------------------------
>
>                 Key: LUCENE-1712
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1712
>             Project: Lucene - Java
>          Issue Type: Improvement
>    Affects Versions: 2.9
>            Reporter: Michael McCandless
>            Priority: Minor
>             Fix For: 2.9
>
>
> This is a spinoff from LUCENE-1701.
> A user using Numeric* should not need to understand what's
> "under the hood" in order to do their indexing & searching.
> They should be able to simply:
> {code}
> doc.add(new NumericField("price", 15.50);
> {code}
> And have a decent default precisionStep selected for them.
> Actually, if we add ctors to NumericField for each of the supported
> types (so the above code works), we can set the default per-type.  I
> think we should do that?
> 4 for int and 6 for long was proposed as good defaults.
> The default need not be "perfect", as advanced users can always
> optimize their precisionStep, and for users experiencing slow
> RangeQuery performance, NumericRangeQuery with any of the defaults we
> are discussing will be much faster.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org


[jira] Closed: (LUCENE-1712) Set default precisionStep for NumericField and NumericRangeFilter

Posted by "Uwe Schindler (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCENE-1712?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Uwe Schindler closed LUCENE-1712.
---------------------------------

    Resolution: Fixed

Committed revision: 793823

Thanks Mike!

> Set default precisionStep for NumericField and NumericRangeFilter
> -----------------------------------------------------------------
>
>                 Key: LUCENE-1712
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1712
>             Project: Lucene - Java
>          Issue Type: Improvement
>    Affects Versions: 2.9
>            Reporter: Michael McCandless
>            Assignee: Uwe Schindler
>            Priority: Minor
>             Fix For: 2.9
>
>         Attachments: LUCENE-1712.patch, LUCENE-1712.patch
>
>
> This is a spinoff from LUCENE-1701.
> A user using Numeric* should not need to understand what's
> "under the hood" in order to do their indexing & searching.
> They should be able to simply:
> {code}
> doc.add(new NumericField("price", 15.50);
> {code}
> And have a decent default precisionStep selected for them.
> Actually, if we add ctors to NumericField for each of the supported
> types (so the above code works), we can set the default per-type.  I
> think we should do that?
> 4 for int and 6 for long was proposed as good defaults.
> The default need not be "perfect", as advanced users can always
> optimize their precisionStep, and for users experiencing slow
> RangeQuery performance, NumericRangeQuery with any of the defaults we
> are discussing will be much faster.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org


[jira] Commented: (LUCENE-1712) Set default precisionStep for NumericField and NumericRangeFilter

Posted by "Michael McCandless (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-1712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12722835#action_12722835 ] 

Michael McCandless commented on LUCENE-1712:
--------------------------------------------

bq. With a direct constructor, there would be the problem of missing data type information.

Sorry, what does that mean?

> Set default precisionStep for NumericField and NumericRangeFilter
> -----------------------------------------------------------------
>
>                 Key: LUCENE-1712
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1712
>             Project: Lucene - Java
>          Issue Type: Improvement
>    Affects Versions: 2.9
>            Reporter: Michael McCandless
>            Priority: Minor
>             Fix For: 2.9
>
>
> This is a spinoff from LUCENE-1701.
> A user using Numeric* should not need to understand what's
> "under the hood" in order to do their indexing & searching.
> They should be able to simply:
> {code}
> doc.add(new NumericField("price", 15.50);
> {code}
> And have a decent default precisionStep selected for them.
> Actually, if we add ctors to NumericField for each of the supported
> types (so the above code works), we can set the default per-type.  I
> think we should do that?
> 4 for int and 6 for long was proposed as good defaults.
> The default need not be "perfect", as advanced users can always
> optimize their precisionStep, and for users experiencing slow
> RangeQuery performance, NumericRangeQuery with any of the defaults we
> are discussing will be much faster.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org


[jira] Commented: (LUCENE-1712) Set default precisionStep for NumericField and NumericRangeFilter

Posted by "Uwe Schindler (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-1712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12722831#action_12722831 ] 

Uwe Schindler commented on LUCENE-1712:
---------------------------------------

Should be:

{code}
doc.add(new NumericField("price").setFloatValue(15.50f));
{code}

With a direct constructor, there would be the problem of missing data type information.

> Set default precisionStep for NumericField and NumericRangeFilter
> -----------------------------------------------------------------
>
>                 Key: LUCENE-1712
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1712
>             Project: Lucene - Java
>          Issue Type: Improvement
>    Affects Versions: 2.9
>            Reporter: Michael McCandless
>            Priority: Minor
>             Fix For: 2.9
>
>
> This is a spinoff from LUCENE-1701.
> A user using Numeric* should not need to understand what's
> "under the hood" in order to do their indexing & searching.
> They should be able to simply:
> {code}
> doc.add(new NumericField("price", 15.50);
> {code}
> And have a decent default precisionStep selected for them.
> Actually, if we add ctors to NumericField for each of the supported
> types (so the above code works), we can set the default per-type.  I
> think we should do that?
> 4 for int and 6 for long was proposed as good defaults.
> The default need not be "perfect", as advanced users can always
> optimize their precisionStep, and for users experiencing slow
> RangeQuery performance, NumericRangeQuery with any of the defaults we
> are discussing will be much faster.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org


[jira] Commented: (LUCENE-1712) Set default precisionStep for NumericField and NumericRangeFilter

Posted by "Michael McCandless (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-1712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12723031#action_12723031 ] 

Michael McCandless commented on LUCENE-1712:
--------------------------------------------

bq. I propose, that all ctors/factories getting a precisionStep default it to 4

+1, with javadocs encouraging experimentation.

I think the ideas to conditionalize the default according to type add
spooky complexity

bq. By the way: It is also a good idea to use valSize (32 or 64) as precisionStep in the case that you do not want to do range queries on the field (and use it only for sorting).

Let's be sure to call out this use-case in the javadocs!


> Set default precisionStep for NumericField and NumericRangeFilter
> -----------------------------------------------------------------
>
>                 Key: LUCENE-1712
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1712
>             Project: Lucene - Java
>          Issue Type: Improvement
>    Affects Versions: 2.9
>            Reporter: Michael McCandless
>            Priority: Minor
>             Fix For: 2.9
>
>
> This is a spinoff from LUCENE-1701.
> A user using Numeric* should not need to understand what's
> "under the hood" in order to do their indexing & searching.
> They should be able to simply:
> {code}
> doc.add(new NumericField("price", 15.50);
> {code}
> And have a decent default precisionStep selected for them.
> Actually, if we add ctors to NumericField for each of the supported
> types (so the above code works), we can set the default per-type.  I
> think we should do that?
> 4 for int and 6 for long was proposed as good defaults.
> The default need not be "perfect", as advanced users can always
> optimize their precisionStep, and for users experiencing slow
> RangeQuery performance, NumericRangeQuery with any of the defaults we
> are discussing will be much faster.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org


[jira] Commented: (LUCENE-1712) Set default precisionStep for NumericField and NumericRangeFilter

Posted by "Michael McCandless (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-1712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12722859#action_12722859 ] 

Michael McCandless commented on LUCENE-1712:
--------------------------------------------

bq.  each time you invoke setFloatValue/setDoubleValue it switches base type behind the scenes?

I think that's an acceptable risk.  I suppose we could add checking to catch you but I don't think that's needed (we should document clearly that you can't "mix types").

> Set default precisionStep for NumericField and NumericRangeFilter
> -----------------------------------------------------------------
>
>                 Key: LUCENE-1712
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1712
>             Project: Lucene - Java
>          Issue Type: Improvement
>    Affects Versions: 2.9
>            Reporter: Michael McCandless
>            Priority: Minor
>             Fix For: 2.9
>
>
> This is a spinoff from LUCENE-1701.
> A user using Numeric* should not need to understand what's
> "under the hood" in order to do their indexing & searching.
> They should be able to simply:
> {code}
> doc.add(new NumericField("price", 15.50);
> {code}
> And have a decent default precisionStep selected for them.
> Actually, if we add ctors to NumericField for each of the supported
> types (so the above code works), we can set the default per-type.  I
> think we should do that?
> 4 for int and 6 for long was proposed as good defaults.
> The default need not be "perfect", as advanced users can always
> optimize their precisionStep, and for users experiencing slow
> RangeQuery performance, NumericRangeQuery with any of the defaults we
> are discussing will be much faster.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org


[jira] Commented: (LUCENE-1712) Set default precisionStep for NumericField and NumericRangeFilter

Posted by "Michael McCandless (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-1712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12730538#action_12730538 ] 

Michael McCandless commented on LUCENE-1712:
--------------------------------------------

Patch looks good Uwe!

> Set default precisionStep for NumericField and NumericRangeFilter
> -----------------------------------------------------------------
>
>                 Key: LUCENE-1712
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1712
>             Project: Lucene - Java
>          Issue Type: Improvement
>    Affects Versions: 2.9
>            Reporter: Michael McCandless
>            Assignee: Uwe Schindler
>            Priority: Minor
>             Fix For: 2.9
>
>         Attachments: LUCENE-1712.patch
>
>
> This is a spinoff from LUCENE-1701.
> A user using Numeric* should not need to understand what's
> "under the hood" in order to do their indexing & searching.
> They should be able to simply:
> {code}
> doc.add(new NumericField("price", 15.50);
> {code}
> And have a decent default precisionStep selected for them.
> Actually, if we add ctors to NumericField for each of the supported
> types (so the above code works), we can set the default per-type.  I
> think we should do that?
> 4 for int and 6 for long was proposed as good defaults.
> The default need not be "perfect", as advanced users can always
> optimize their precisionStep, and for users experiencing slow
> RangeQuery performance, NumericRangeQuery with any of the defaults we
> are discussing will be much faster.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org


[jira] Commented: (LUCENE-1712) Set default precisionStep for NumericField and NumericRangeFilter

Posted by "Uwe Schindler (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-1712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12722993#action_12722993 ] 

Uwe Schindler commented on LUCENE-1712:
---------------------------------------

In my opinion, I would like to keep the precisionStep parameter required and give the 4 constants for each data type in NumericUtils.

On the other hand 4 is a (maybe) good default, so I propose, that all ctors/factories getting a precisionStep default it to 4, if left out. precisionStep is a final variable in NumericTokenStream (and so in NumericField), because it does not make sense to change it. If "field" is final, also precisionStep should be final (one field must always use the same precision step). In principle Mike is right, the type is also fixed after first calling setXxxValue, so I could throw an IAE, if somebody calles a setter for a different datatype after the first one. A IllegalStateEx is thrown, when the value was not initialized and the docinverter tries to use the token stream.

Here are two ideas to fix this the defaultPrecStep per type:

1. Special value 0 as default precStep:
- the no-precStep ctor sets the precStep in NumTokenStream to 0 (invalid value), if one is given it must be >0 and <=valsize
- when delivering the tokens, NumTokenStream uses the default for this data type if precStep==0 and the given value in all other cases
In this case the precStep is still final in NumericTokenStream, with 0 means "automatic".


2. There is one other idea:
NumericField/-TokenStream could have a required ctor param type that can be NumericField.Type.INT,... In this case the default could be choosen very simple at the beginning. And it also fixes the data type. If somebody calls setDoubleValue but has initialized the TokenStream with NumericField.Type.INT, he will get an UOE.


The javadocs should always clearly note, that one should check out a good precStep.

---
By the way: It is also a good idea to use valSize (32 or 64) as precisionStep in the case that you do not want to do range queries on the field (and use it only for sorting). RangeQueries would still work, but are as slow as conventional ones (current solr trunk contains this hint in its TrieField docs/schema)

> Set default precisionStep for NumericField and NumericRangeFilter
> -----------------------------------------------------------------
>
>                 Key: LUCENE-1712
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1712
>             Project: Lucene - Java
>          Issue Type: Improvement
>    Affects Versions: 2.9
>            Reporter: Michael McCandless
>            Priority: Minor
>             Fix For: 2.9
>
>
> This is a spinoff from LUCENE-1701.
> A user using Numeric* should not need to understand what's
> "under the hood" in order to do their indexing & searching.
> They should be able to simply:
> {code}
> doc.add(new NumericField("price", 15.50);
> {code}
> And have a decent default precisionStep selected for them.
> Actually, if we add ctors to NumericField for each of the supported
> types (so the above code works), we can set the default per-type.  I
> think we should do that?
> 4 for int and 6 for long was proposed as good defaults.
> The default need not be "perfect", as advanced users can always
> optimize their precisionStep, and for users experiencing slow
> RangeQuery performance, NumericRangeQuery with any of the defaults we
> are discussing will be much faster.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org


[jira] Updated: (LUCENE-1712) Set default precisionStep for NumericField and NumericRangeFilter

Posted by "Uwe Schindler (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCENE-1712?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Uwe Schindler updated LUCENE-1712:
----------------------------------

    Attachment: LUCENE-1712.patch

Some updates:
- remove the limitation of the precisionStep to the value size (32 or 64). To index a non-trie-encoded numeric field, you can simply use Integer.MAX_VALUE or any other value>=64 as precStep. The lower limit of precStep is 1.
- add a test for 64 bit values with recommended precStep=6
- add simple test for unlimited precStep as above
I will commit this shortly.

> Set default precisionStep for NumericField and NumericRangeFilter
> -----------------------------------------------------------------
>
>                 Key: LUCENE-1712
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1712
>             Project: Lucene - Java
>          Issue Type: Improvement
>    Affects Versions: 2.9
>            Reporter: Michael McCandless
>            Assignee: Uwe Schindler
>            Priority: Minor
>             Fix For: 2.9
>
>         Attachments: LUCENE-1712.patch, LUCENE-1712.patch
>
>
> This is a spinoff from LUCENE-1701.
> A user using Numeric* should not need to understand what's
> "under the hood" in order to do their indexing & searching.
> They should be able to simply:
> {code}
> doc.add(new NumericField("price", 15.50);
> {code}
> And have a decent default precisionStep selected for them.
> Actually, if we add ctors to NumericField for each of the supported
> types (so the above code works), we can set the default per-type.  I
> think we should do that?
> 4 for int and 6 for long was proposed as good defaults.
> The default need not be "perfect", as advanced users can always
> optimize their precisionStep, and for users experiencing slow
> RangeQuery performance, NumericRangeQuery with any of the defaults we
> are discussing will be much faster.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org


[jira] Commented: (LUCENE-1712) Set default precisionStep for NumericField and NumericRangeFilter

Posted by "Uwe Schindler (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-1712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12722839#action_12722839 ] 

Uwe Schindler commented on LUCENE-1712:
---------------------------------------

The methods setFloatValue() etc are for specifying the exact data type. If you would do it in the constructor, the resulting code would be very error-prone. e.g., new NumericField("price", 12.5): Does this mean 12.5 as double or float, is 15 alone meant as (in future) short, byte or int? OK users can add "f" in the float case to it, but this makes it very hard to prevent errors, because Java automatically casts all numeric types to each other suddenly. This is why I added these factories for NumericRangeQuery and the setters here.

> Set default precisionStep for NumericField and NumericRangeFilter
> -----------------------------------------------------------------
>
>                 Key: LUCENE-1712
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1712
>             Project: Lucene - Java
>          Issue Type: Improvement
>    Affects Versions: 2.9
>            Reporter: Michael McCandless
>            Priority: Minor
>             Fix For: 2.9
>
>
> This is a spinoff from LUCENE-1701.
> A user using Numeric* should not need to understand what's
> "under the hood" in order to do their indexing & searching.
> They should be able to simply:
> {code}
> doc.add(new NumericField("price", 15.50);
> {code}
> And have a decent default precisionStep selected for them.
> Actually, if we add ctors to NumericField for each of the supported
> types (so the above code works), we can set the default per-type.  I
> think we should do that?
> 4 for int and 6 for long was proposed as good defaults.
> The default need not be "perfect", as advanced users can always
> optimize their precisionStep, and for users experiencing slow
> RangeQuery performance, NumericRangeQuery with any of the defaults we
> are discussing will be much faster.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org


[jira] Commented: (LUCENE-1712) Set default precisionStep for NumericField and NumericRangeFilter

Posted by "Michael McCandless (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-1712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12729432#action_12729432 ] 

Michael McCandless commented on LUCENE-1712:
--------------------------------------------

How about we add PRECISION_STEP_DEFAULT=4, make that the default for all types, and then note in the javadocs the "interesting" values for precision step (ie for sorting only)?

> Set default precisionStep for NumericField and NumericRangeFilter
> -----------------------------------------------------------------
>
>                 Key: LUCENE-1712
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1712
>             Project: Lucene - Java
>          Issue Type: Improvement
>    Affects Versions: 2.9
>            Reporter: Michael McCandless
>            Priority: Minor
>             Fix For: 2.9
>
>
> This is a spinoff from LUCENE-1701.
> A user using Numeric* should not need to understand what's
> "under the hood" in order to do their indexing & searching.
> They should be able to simply:
> {code}
> doc.add(new NumericField("price", 15.50);
> {code}
> And have a decent default precisionStep selected for them.
> Actually, if we add ctors to NumericField for each of the supported
> types (so the above code works), we can set the default per-type.  I
> think we should do that?
> 4 for int and 6 for long was proposed as good defaults.
> The default need not be "perfect", as advanced users can always
> optimize their precisionStep, and for users experiencing slow
> RangeQuery performance, NumericRangeQuery with any of the defaults we
> are discussing will be much faster.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org


[jira] Updated: (LUCENE-1712) Set default precisionStep for NumericField and NumericRangeFilter

Posted by "Uwe Schindler (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCENE-1712?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Uwe Schindler updated LUCENE-1712:
----------------------------------

    Attachment: LUCENE-1712.patch

Attached is a patch with the default precisionStep of 4. The javadocs of NumericRangeQuery list all possible and senseful values.

This patch also contains some cleanup in NumericUtils (rename constants) and a lot of other JavaDocs fixes. It also changes the token types of the TokenStream (no difference between 32/64 bit vals needed) and adds a test for them.

> Set default precisionStep for NumericField and NumericRangeFilter
> -----------------------------------------------------------------
>
>                 Key: LUCENE-1712
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1712
>             Project: Lucene - Java
>          Issue Type: Improvement
>    Affects Versions: 2.9
>            Reporter: Michael McCandless
>            Assignee: Uwe Schindler
>            Priority: Minor
>             Fix For: 2.9
>
>         Attachments: LUCENE-1712.patch
>
>
> This is a spinoff from LUCENE-1701.
> A user using Numeric* should not need to understand what's
> "under the hood" in order to do their indexing & searching.
> They should be able to simply:
> {code}
> doc.add(new NumericField("price", 15.50);
> {code}
> And have a decent default precisionStep selected for them.
> Actually, if we add ctors to NumericField for each of the supported
> types (so the above code works), we can set the default per-type.  I
> think we should do that?
> 4 for int and 6 for long was proposed as good defaults.
> The default need not be "perfect", as advanced users can always
> optimize their precisionStep, and for users experiencing slow
> RangeQuery performance, NumericRangeQuery with any of the defaults we
> are discussing will be much faster.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org


[jira] Commented: (LUCENE-1712) Set default precisionStep for NumericField and NumericRangeFilter

Posted by "Uwe Schindler (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-1712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12722845#action_12722845 ] 

Uwe Schindler commented on LUCENE-1712:
---------------------------------------

I think this is simple, we just overload all ctors and factories and remove the precisionStep there. These methods use NumericUtils.DEFAULT_PRECISION_STEP = 4 then. But thee should be a clear not, to do also tests with other step values.

additional variants:
- NumericField(name), NumericField(name,store,index)
- NumericTokenStream()
- NumericRangeQuery.newXxxRange(field, min, max,...)

I am currentlly not sure (I was thinking the whole time during including into core) to also make NumericRangeQuery work like the other two classes: generic Constructor without datatype and then set the range explicit:

{code}
new NumericRangeQuery(fieldname[, precisionStep]).setFloatRange(....)
{code}

Not sure, in this case the API where similar and I have to override only one ctor for different construction parameters. My only problem is, that Queries are normally almost everywhere in Lucene static and unmodifable  (beyond boost).

> Set default precisionStep for NumericField and NumericRangeFilter
> -----------------------------------------------------------------
>
>                 Key: LUCENE-1712
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1712
>             Project: Lucene - Java
>          Issue Type: Improvement
>    Affects Versions: 2.9
>            Reporter: Michael McCandless
>            Priority: Minor
>             Fix For: 2.9
>
>
> This is a spinoff from LUCENE-1701.
> A user using Numeric* should not need to understand what's
> "under the hood" in order to do their indexing & searching.
> They should be able to simply:
> {code}
> doc.add(new NumericField("price", 15.50);
> {code}
> And have a decent default precisionStep selected for them.
> Actually, if we add ctors to NumericField for each of the supported
> types (so the above code works), we can set the default per-type.  I
> think we should do that?
> 4 for int and 6 for long was proposed as good defaults.
> The default need not be "perfect", as advanced users can always
> optimize their precisionStep, and for users experiencing slow
> RangeQuery performance, NumericRangeQuery with any of the defaults we
> are discussing will be much faster.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org