Posted to solr-commits@lucene.apache.org by Apache Wiki <wi...@apache.org> on 2007/09/18 18:01:46 UTC

[Solr Wiki] Update of "FunctionQuery" by YonikSeeley

The following page has been changed by YonikSeeley:
http://wiki.apache.org/solr/FunctionQuery

The comment on the change is:
overhaul + document new functions

------------------------------------------------------------------------------
+ FunctionQuery allows one to use the actual value of a numeric field, and functions of that field, in a relevancy score.
- The Javadoc - http://lucene.apache.org/solr/api/org/apache/solr/search/function/FunctionQuery.html 
- does not have much information.
  
+ [[TableOfContents]]
- A little more information is available at the javadoc for the parseFunction method under QueryParsing :
- http://lucene.apache.org/solr/api/org/apache/solr/search/QueryParsing.html#parseFunction(java.lang.String,%20org.apache.solr.schema.IndexSchema)
  
- The javadocs for different possible functions (described below ) can be seen as the subclasses of  - http://lucene.apache.org/solr/api/org/apache/solr/search/function/ValueSource.html
+ = Using FunctionQuery =
+ There are a few ways to use FunctionQuery from Solr's HTTP interface:
+  1. Embed a FunctionQuery in a regular query expressed in SolrQuerySyntax via the _val_ hook
+  2. Use a parameter that has an explicit type of FunctionQuery, such as DisMaxRequestHandler's '''bf''' (boost function) parameter.
+      NOTE: the '''bf''' parameter actually takes a list of function queries separated by whitespace and each with an optional boost.  Make sure to eliminate any internal whitespace in single function queries when using '''bf'''.
+      Example: {{{q=dismax&bf="ord(popularity)^0.5 recip(rord(price),1,1000,1000)^0.3"}}}
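
As a rough illustration of the whitespace rule for '''bf''', the following Python sketch joins several function queries into one parameter value. The helper name build_bf is hypothetical and only the Python standard library is used; nothing here is part of Solr itself.

```python
from urllib.parse import urlencode


def build_bf(functions):
    """Join (function_string, boost_or_None) pairs into one bf value.

    Hypothetical helper: whitespace inside each function is removed so
    that the only whitespace left is the separator between functions.
    """
    parts = []
    for func, boost in functions:
        compact = "".join(func.split())  # drop all internal whitespace
        if boost is None:
            parts.append(compact)
        else:
            parts.append("%s^%s" % (compact, boost))  # no space before the boost
    return " ".join(parts)


bf = build_bf([
    ("ord(popularity)", 0.5),
    ("recip(rord(price), 1, 1000, 1000)", 0.3),
])
# URL-encode the value before placing it in a request URL.
encoded = urlencode({"bf": bf})
print(bf)
print(encoded)
```

The function strings and boosts mirror the example above; how the remaining request parameters are named depends on the handler configuration and is left out on purpose.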
  
- When specifying these in your solrconfig. Take care to :
+ = Function Query Syntax =
+ There is currently no infix parser - functions must be expressed as function calls (e.g. sum(a,b) instead of a+b)
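
Because there is no infix parser, a client may want to translate simple infix expressions into function-call syntax before sending them. The sketch below does this with Python's ast module. The name to_function_syntax and the operator mapping are assumptions made for illustration, not a Solr feature, and only operators that have a matching function documented on this page are handled.

```python
import ast

# Assumed mapping from Python infix operators to function names
# documented on this page. Subtraction is omitted because no matching
# function is listed here.
_OPS = {ast.Add: "sum", ast.Mult: "product", ast.Div: "div", ast.Pow: "pow"}


def to_function_syntax(expr):
    """Convert an infix expression such as 'a+b' into 'sum(a,b)'."""

    def render(node):
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            name = _OPS[type(node.op)]
            return "%s(%s,%s)" % (name, render(node.left), render(node.right))
        if (isinstance(node, ast.Call) and isinstance(node.func, ast.Name)
                and not node.keywords):
            # Existing function calls are passed through unchanged.
            args = ",".join(render(arg) for arg in node.args)
            return "%s(%s)" % (node.func.id, args)
        if isinstance(node, ast.Name):
            return node.id  # treated as a field name
        if (isinstance(node, ast.Constant)
                and isinstance(node.value, (int, float))
                and not isinstance(node.value, bool)):
            return repr(node.value)
        raise ValueError("unsupported expression: " + ast.dump(node))

    return render(ast.parse(expr, mode="eval").body)


print(to_function_syntax("a + b"))
print(to_function_syntax("sqrt(x) + log(y) * 2"))
```

The output contains no whitespace, which keeps it usable in parameters that separate multiple functions by whitespace.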
  
+ = Available Functions =
-   1. verify there is some whitespace between the functions that you are specifying 
-   2. eliminate the space inside the recip functions - including spaces around commas - this leads to a NumberFormatException
-   3. verify that there is no space between a function and its boost
  
- eg. {{{'''recip(popularityRank,1,1000,1000)^2.5  recip(rord(creationDate),1,1000,1000)^1.3'''}}}
+ === constant ===
+ <!> ["Solr1.3"]
+ Floating point constants.
+     Example Syntax: '''1.5'''
  
+     SolrQuerySyntax Example: '''_val_:1.5'''
-  
-   * '''myfield''' - Field value itself can be used as a function
-   Numeric fields default to correct type
-   (ie: IntFieldSource or FloatFieldSource)
-   Others use implicit ord(...) to generate numeric field value
-   A field like popularity might be a good example use case for this. 
  
-   * '''ord(myfield)'''  - OrdFieldSource. Obtains the ordinal of the field value from the default Lucene FieldCache using getStringIndex(). The native lucene index order is used to assign an ordinal value for each field value. Field values (terms) are lexicographically ordered by unicode value, and numbered starting at 1. [[BR]]Example: If there were only three field values: "apple","banana","pear" then ord("apple")=1, ord("banana")=2, ord("pear")=3 WARNING: ord() depends on the position in an index and can thus change when other documents are inserted or deleted, or if a MultiSearcher is used. 
+ === fieldvalue ===
+ This function returns the numeric field value of an indexed field with a maximum of one value per document (not multiValued).  The syntax is simply the field name by itself.  0 is returned for documents without a value in the field.
+     Example Syntax: '''myFloatField'''
  
-   * '''rord(myfield)''' - ReverseOrdFieldSource, The reverse ordering of what ord provides - javadoc - http://lucene.apache.org/solr/api/org/apache/solr/search/function/ReverseOrdFieldSource.html 
+     SolrQuerySyntax Example: '''_val_:myFloatField'''
  
+ === ord ===
+ ord(myfield) returns the ordinal of the indexed field value within the indexed list of terms for that field in lucene index order (lexicographically ordered by unicode value), starting at 1.  The field must have a maximum of one value per document (not multiValued).  0 is returned for documents without a value in the field.
+    Example: If there were only three field values: "apple","banana","pear" then ord("apple")=1, ord("banana")=2, ord("pear")=3 
-   * '''linear(myfield,1,2)''' -  LinearFloatFunction on numeric field value. [[BR]]
-    I THINK ( this is not from documentation, please correct this)  that this function implements the value - [[BR]]
-    f(x) = m*x + c [[BR]]
-    In the above case, It will implement  myField*1 + 2
  
+    Example Syntax: '''ord(myIndexedField)'''
  
-   * '''max(linear(myfield,1,2),100)''' -   MaxFloatFunction of LinearFloatFunction on numeric field value or constant.  Returns the max of a ValueSource and a float (which is useful for "bottoming out" another function at 0.0, or some positive number). 
+    Example SolrQuerySyntax: '''_val_:"ord(myIndexedField)"'''
  
-   * '''recip(myfield,m,a,b)''' ReciprocalFloatFunction on numeric field value. ReciprocalFloatFunction implements a reciprocal function [[BR]] f(x) = a/(mx+b), based on the float value of a field as exported by ValueSource.[[BR]] When a and b are equal, and x>=0, this function has a maximum value of 1 that drops as x increases. Increasing the value of a and b together results in a movement of the entire function to a flatter part of the curve. These properties make this an ideal function for boosting more recent documents. 
+ WARNING: ord() depends on the position in an index and can thus change when other documents are inserted or deleted, or if a !MultiSearcher is used.
  
- (Insert graphs here to show the variation of f(x) as a & b change)
+ === rord ===
+ The reverse ordering of what ord provides.
+     Example Syntax: '''rord(myIndexedField)'''
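
The ordinal numbering described for ord and rord can be mimicked outside of Solr. The following Python sketch is only a model of the behaviour described above: it sorts the distinct values (Python compares strings by code point) and numbers them from 1, and it assumes that rord simply counts from the other end of the same list. The function name ordinals is hypothetical.

```python
def ordinals(values):
    """Return (ord, rord) lookup tables for the distinct values given."""
    terms = sorted(set(values))  # distinct terms in code point order
    ord_of = {term: i + 1 for i, term in enumerate(terms)}
    # Assumption: rord numbers the same terms starting from the last one.
    rord_of = {term: len(terms) - i for i, term in enumerate(terms)}
    return ord_of, rord_of


ord_of, rord_of = ordinals(["pear", "apple", "banana", "apple"])
print(ord_of)
print(rord_of)

# Adding a document with a new term shifts the ordinals of later terms,
# which is the effect the WARNING above refers to.
ord_after, _ = ordinals(["pear", "apple", "banana", "avocado"])
print(ord_after["banana"])
```

In this model "banana" moves from ordinal 2 to ordinal 3 once "avocado" is indexed, so scores built on ord or rord are only comparable within one state of the index.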
  
+ === sum ===
+ <!> ["Solr1.3"]
+ sum(x,y,...) returns the sum of multiple functions.
+     Example Syntax: '''sum(x,1)'''
-  
-    * Combinations - You can combine the above functions as suitable [[BR]]
- eg. '''recip(rord(myfield),1,2,3)''' -  ReciprocalFloatFunction on ReverseOrdFieldSource
-     '''recip(linear(rord(myfield),1,2),3,4,5)''' -  ReciprocalFloatFunction on LinearFloatFunction on ReverseOrdFieldSource
-  
  
+     Example Syntax: '''sum(x,y)'''
+ 
+     Example Syntax: '''sum(sqrt(x),log(y),z,0.5)'''
+ 
+ === product ===
+ <!> ["Solr1.3"]
+ product(x,y,...) returns the product of multiple functions.
+     Example Syntax: '''product(x,2)'''
+ 
+     Example Syntax: '''product(x,y)'''
+ 
+ === div ===
+ <!> ["Solr1.3"]
+ div(x,y) divides the function x by the function y.
+     Example Syntax: '''div(1,x)'''
+ 
+     Example Syntax: '''div(sum(x,100),max(y,1))'''
+ 
+ === pow ===
+ <!> ["Solr1.3"]
+ pow(x,y) raises the base x to the power y.
+     Example Syntax: '''pow(x,0.5)'''   same as sqrt
+ 
+     Example Syntax: '''pow(x,log(y))'''
+ 
+ === abs ===
+ <!> ["Solr1.3"]
+ abs(x) returns the absolute value of a function.
+     Example Syntax: '''abs(-5)'''
+ 
+     Example Syntax: '''abs(x)'''
+ 
+ === log ===
+ <!> ["Solr1.3"]
+ log(x) returns log base 10 of the function x.
+     Example Syntax: '''log(x)'''
+ 
+     Example Syntax: '''log(sum(x,100))'''
+ 
+ === sqrt ===
+ <!> ["Solr1.3"]
+ sqrt(x) returns the square root of the function x.
+     Example Syntax: '''sqrt(2)'''
+ 
+     Example Syntax: '''sqrt(sum(x,100))'''
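
To check what a nested expression built from sum, product, div, pow, abs, log and sqrt should evaluate to, the arithmetic can be mirrored in a few lines of Python. This is a sketch for sanity checking only: the fq_ prefixed names are hypothetical, and Python's double precision floats are assumed to be close enough to the values Solr computes.

```python
import math


def fq_sum(*args):
    return float(sum(args))


def fq_product(*args):
    result = 1.0
    for value in args:
        result *= value
    return result


def fq_div(x, y):
    return x / y


def fq_pow(x, y):
    return math.pow(x, y)


def fq_abs(x):
    return abs(x)


def fq_log(x):
    return math.log10(x)  # log is base 10, as stated above


def fq_sqrt(x):
    return math.sqrt(x)


# Mirrors sum(sqrt(x),log(y),z,0.5) for one document with sample values.
x, y, z = 4.0, 100.0, 1.0
score = fq_sum(fq_sqrt(x), fq_log(y), z, 0.5)
print(score)
```

With these sample values the expression works out to 2 + 2 + 1 + 0.5.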
+ 
+ === map ===
+ <!> ["Solr1.3"]
+ map(x,min,max,target) maps any values of the function x that fall within min and max inclusive to target.  min,max,target are constants.
+     Example Syntax: '''map(x,0,0,1)'''  change any values of 0 to 1... useful in handling default 0 values
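
The behaviour of map on a single value can be written out as follows. This is a small Python model of the description above; the name map_value is hypothetical and chosen to avoid clashing with Python's built-in map.

```python
def map_value(x, range_min, range_max, target):
    """Return target if x lies in [range_min, range_max], else x unchanged."""
    if range_min <= x <= range_max:
        return target
    return x


# Mirrors map(x,0,0,1): only a value of exactly 0 is replaced.
print([map_value(v, 0, 0, 1) for v in [0, 0.5, 3]])
```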
+ 
+ === scale ===
+ <!> ["Solr1.3"]
+ scale(x,minTarget,maxTarget) scales values of the function x such that they fall between minTarget and maxTarget inclusive.
+     Example Syntax: '''scale(x,1,2)'''  all values will be between 1 and 2 inclusive.
+ 
+     NOTE: The current implementation traverses all of the function values to obtain the min and max so that it can pick the correct scale.
+ 
+     NOTE: This implementation currently cannot distinguish deleted documents or documents without a value from documents whose value really is 0.0; a value of 0.0 is used in both cases.  This means that even if all real values are greater than 0.0, 0.0 can still end up as the min value to map from.  In these cases, an appropriate map() function can be used as a workaround to change 0.0 to a value in the real range.  Example: '''scale(map(x,0,0,5),1,2)'''
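
The effect of stray 0.0 values on scale, and the map workaround, can be seen in a small Python model. The sketch assumes a plain linear min-max rescaling, which is one reading of the description above and not taken from the Solr source; scale_values and map_value are hypothetical names, and the sample values are made up.

```python
def map_value(x, range_min, range_max, target):
    """Return target if x lies in [range_min, range_max], else x."""
    return target if range_min <= x <= range_max else x


def scale_values(values, min_target, max_target):
    """Linearly rescale values so they span [min_target, max_target]."""
    lo, hi = min(values), max(values)  # requires a pass over all values
    if hi == lo:
        return [float(min_target) for _ in values]
    span = max_target - min_target
    return [min_target + (v - lo) * span / (hi - lo) for v in values]


# The third document has no value, so it shows up as 0.0.
raw = [10.0, 20.0, 0.0, 30.0]

skewed = scale_values(raw, 1, 2)  # 0.0 is taken as the min
fixed = scale_values([map_value(v, 0, 0, 10.0) for v in raw], 1, 2)
print(skewed)
print(fixed)
```

In the first call the real minimum of 10.0 no longer maps to 1, because the placeholder 0.0 occupies the bottom of the range. Mapping 0.0 to a value inside the real range first restores the intended spread.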
+ 
+ === linear ===
+ linear(x,m,c) implements m*x+c where m and c are constants and x is an arbitrary function.  This is equivalent to '''sum(product(m,x),c)''', but slightly more efficient as it is implemented as a single function.
+     Example Syntax: '''linear(x,2,4)'''  returns 2*x+4
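
The stated equivalence between linear(x,m,c) and sum(product(m,x),c) is easy to check numerically. The Python sketch below is a model of the arithmetic only; both function names are hypothetical stand-ins for the Solr functions.

```python
def linear(x, m, c):
    """Single-step form: m*x + c."""
    return m * x + c


def sum_of_product(x, m, c):
    """Two-step form: product(m, x) first, then sum with c."""
    product = m * x
    return product + c


for x in [0, 1, 2.5, 100]:
    assert linear(x, 2, 4) == sum_of_product(x, 2, 4)

print(linear(3, 2, 4))
```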
+ 
+ === recip ===
+ '''recip(x,m,a,b)''' is a reciprocal function implementing a/(m*x+b), where m, a and b are constants and x is an arbitrarily complex function.
+ 
+ When a and b are equal, and x>=0, this function has a maximum value of 1 that drops as x increases. Increasing the value of a and b together results in a movement of the entire function to a flatter part of the curve. These properties can make this an ideal function for boosting more recent documents when x is rord(datefield).
+     Example Syntax: '''recip(rord(creationDate),1,1000,1000)'''
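
To get a feel for how quickly the boost falls off, the formula a/(m*x+b) can be tabulated for the constants used in the example. This Python sketch only evaluates the formula given above; the sample x values are arbitrary and stand in for whatever the inner function returns.

```python
def recip(x, m, a, b):
    """Evaluate a / (m*x + b)."""
    return a / (m * x + b)


# Constants from the example: m=1, a=1000, b=1000.
for x in [0, 1, 10, 100, 1000, 10000]:
    print(x, recip(x, 1, 1000, 1000))
```

With a and b equal the value starts at 1 for x=0 and halves once x reaches b/m, which is 1000 here. Raising a and b together therefore stretches the range of x over which documents still receive a noticeable boost.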
+ 
+ === max ===
+ max(x,c) returns the max of another function and a constant.  Useful for "bottoming out" another function at some constant.
+     Example Syntax: '''max(myfield,0)'''
+ 

Re: [Solr Wiki] Update of "FunctionQuery" by YonikSeeley

Posted by Chris Hostetter <ho...@fucit.org>.
: I suppose, but there really was no "function query parser" class where
: everything could be aggregated.  package javadoc? or on FunctionQuery?

I've been trying to keep QueryParsing#parseFunction up to date, since it's 
what actually does the parsing (keep the docs close to the code) but 
package javadoc would also make sense.

Given that desire to make the parser more configurable at run time, it 
might make sense to go ahead and create a FunctionQueryParser class, 
refactor all the code in QueryParsing#parseFunction into it, and put the 
docs there.




-Hoss


Re: [Solr Wiki] Update of "FunctionQuery" by YonikSeeley

Posted by Yonik Seeley <yo...@apache.org>.
On 9/18/07, Chris Hostetter <ho...@fucit.org> wrote:
> : The comment on the change is:
> : overhaul + document new functions
>
> Shouldn't the majority of this documentation go in the javadocs?  The wiki
> can always point at the javadocs, but as a part of the release the
> javadocs should really stand alone without needing to point to the wiki.

I suppose, but there really was no "function query parser" class where
everything could be aggregated.  package javadoc? or on FunctionQuery?

-Yonik

Re: [Solr Wiki] Update of "FunctionQuery" by YonikSeeley

Posted by Chris Hostetter <ho...@fucit.org>.
: The comment on the change is:
: overhaul + document new functions

Shouldn't the majority of this documentation go in the javadocs?  The wiki 
can always point at the javadocs, but as a part of the release the 
javadocs should really stand alone without needing to point to the wiki.


-Hoss