Posted to dev@lucene.apache.org by lu...@jakarta.apache.org on 2004/05/12 04:19:04 UTC
[Jakarta Lucene Wiki] Updated: SearchNumericalFields
Date: 2004-05-11T19:19:04
Editor: 150.101.152.16 <>
Wiki: Jakarta Lucene Wiki
Page: SearchNumericalFields
URL: http://wiki.apache.org/jakarta-lucene/SearchNumericalFields
How to handle negative numbers, by Matt Quail
Change Log:
------------------------------------------------------------------------------
@@ -66,3 +66,63 @@
== For decimals ==
If decimals cause problems, you can multiply every value by a fixed factor (for example 100) so that only whole numbers are indexed. (comment by sv)
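
A minimal sketch of the multiplier idea, assuming two decimal places and a fixed width of 10 digits. The class name `ScaledDecimalField`, the `SCALE` and `WIDTH` constants, and the choice of `BigDecimal` are illustrative assumptions, not part of the wiki page. Negative values are rejected here because they need separate handling.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;
import java.util.Locale;

class ScaledDecimalField {
    // Assumption: two decimal places are enough (e.g. prices); adjust as needed.
    static final int SCALE = 2;
    // Assumption: ten digits are wide enough for every scaled value being indexed.
    static final int WIDTH = 10;

    /** Scales a non-negative decimal to a whole number and zero-pads it to WIDTH digits. */
    static String encode(String decimal) {
        BigDecimal scaled = new BigDecimal(decimal)
                .setScale(SCALE, RoundingMode.HALF_UP) // round to SCALE places
                .movePointRight(SCALE);                // multiply by 10^SCALE
        long units = scaled.longValueExact();
        if (units < 0) {
            throw new IllegalArgumentException("negative values are not handled in this sketch");
        }
        String padded = String.format(Locale.ROOT, "%0" + WIDTH + "d", units);
        if (padded.length() > WIDTH) {
            throw new IllegalArgumentException("value too large for WIDTH");
        }
        return padded;
    }

    public static void main(String[] args) {
        System.out.println(encode("12.5"));    // prints 0000001250
        System.out.println(encode("3.14159")); // prints 0000000314
    }
}
```

Because every encoded string has the same length, comparing the strings gives the same order as comparing the original values.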
+
+ == Handling positive and negative numbers ==
+
+ If you want a numerical field that may contain both positive and negative numbers, you still need to format them as strings. You must ensure that for any numbers a and b, if a<b then format(a)<format(b) when the strings are compared lexicographically. The problem cases are:
+ * when one number is negative and the other is positive
+ * when both are negative, e.g. -200 is less than -1, even though "-200" is lexicographically '''greater''' than "-1"
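
The second problem case can be reproduced with a few lines of Java. This is only an illustration; the class name `NaiveOrderingDemo` and the helper `sortAsStrings` are made up for this example.

```java
import java.util.Arrays;

class NaiveOrderingDemo {
    /** Converts the ints to plain decimal strings and sorts them lexicographically. */
    static String[] sortAsStrings(int... values) {
        String[] strings = new String[values.length];
        for (int k = 0; k < values.length; k++) {
            strings[k] = Integer.toString(values[k]);
        }
        Arrays.sort(strings); // String order, not numeric order
        return strings;
    }

    public static void main(String[] args) {
        // Numeric order would be -200, -1, 5.
        System.out.println(Arrays.toString(sortAsStrings(-200, -1, 5))); // prints [-1, -200, 5]
    }
}
```

The string "-1" sorts before "-200" because the comparison stops at the first differing character, '1' versus '2'.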
+
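+ The "tricks" to handle these problems are:
+ * use a prefix char for positive and negative numbers so that a negative string is always less than a positive one. '-' and '0' are suitable for this, because '-' sorts before '0' in ASCII
+ * "invert" the magnitude of negative numbers, so that a number further below zero produces a smaller string (with the range below, -10000 maps to 0 and -1 maps to 9999)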
+
+ Here is some code for an encoder/decoder that does both of these things for ints in the range -10000 to 9999. You could modify it to accept a double so long as you change the FORMAT appropriately.
+
+ {{{
+ // Requires: import java.text.DecimalFormat;
+ private static final char NEGATIVE_PREFIX = '-';
+ // NB: NEGATIVE_PREFIX must be < POSITIVE_PREFIX
+ private static final char POSITIVE_PREFIX = '0';
+ public static final int MAX_ALLOWED = 9999;
+ public static final int MIN_ALLOWED = -10000;
+ // Fixed width, so that all encoded values have the same length
+ private static final String FORMAT = "00000";
+ /**
+  * Converts an int to a String suitable for indexing.
+  */
+ public static String encode(int i) {
+     if ((i < MIN_ALLOWED) || (i > MAX_ALLOWED)) {
+         throw new IllegalArgumentException("out of allowed range");
+     }
+     char prefix;
+     if (i < 0) {
+         prefix = NEGATIVE_PREFIX;
+         // invert the magnitude: -10000 becomes 0, -1 becomes 9999
+         i = MAX_ALLOWED + i + 1;
+     } else {
+         prefix = POSITIVE_PREFIX;
+     }
+     DecimalFormat fmt = new DecimalFormat(FORMAT);
+     return prefix + fmt.format(i);
+ }
+ /**
+  * Converts a String that was returned by {@link #encode} back to
+  * an int.
+  */
+ public static int decode(String str) {
+     char prefix = str.charAt(0);
+     int i = Integer.parseInt(str.substring(1));
+     if (prefix == POSITIVE_PREFIX) {
+         // non-negative value: nothing to undo
+     } else if (prefix == NEGATIVE_PREFIX) {
+         // undo the inversion applied in encode
+         i = i - MAX_ALLOWED - 1;
+     } else {
+         throw new NumberFormatException("string does not begin with the correct prefix");
+     }
+     return i;
+ }
+ }}}
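
As a quick sanity check, the sketch below restates the encoder and decoder in condensed form and confirms that sorting the encoded strings gives numeric order. The class name `SignedIntCodecCheck` and the helper `orderPreserved` are assumptions added for this illustration; the encoding itself follows the code above.

```java
import java.text.DecimalFormat;
import java.util.Arrays;

class SignedIntCodecCheck {
    static final int MAX_ALLOWED = 9999;
    static final int MIN_ALLOWED = -10000;

    /** Condensed copy of the encoder above: prefix char plus five zero-padded digits. */
    static String encode(int i) {
        if (i < MIN_ALLOWED || i > MAX_ALLOWED) {
            throw new IllegalArgumentException("out of allowed range");
        }
        char prefix = (i < 0) ? '-' : '0';
        if (i < 0) {
            i = MAX_ALLOWED + i + 1; // invert the magnitude of negative numbers
        }
        return prefix + new DecimalFormat("00000").format(i);
    }

    /** Condensed copy of the decoder above. */
    static int decode(String str) {
        int i = Integer.parseInt(str.substring(1));
        if (str.charAt(0) == '0') {
            return i;
        } else if (str.charAt(0) == '-') {
            return i - MAX_ALLOWED - 1;
        }
        throw new NumberFormatException("string does not begin with the correct prefix");
    }

    /** Returns true if sorting the encoded strings yields the same order as sorting the ints. */
    static boolean orderPreserved(int... values) {
        int[] numeric = values.clone();
        Arrays.sort(numeric);
        String[] encoded = new String[values.length];
        for (int k = 0; k < values.length; k++) {
            encoded[k] = encode(values[k]);
        }
        Arrays.sort(encoded);
        for (int k = 0; k < values.length; k++) {
            if (decode(encoded[k]) != numeric[k]) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(encode(-200)); // prints -09800
        System.out.println(encode(-1));   // prints -09999
        System.out.println(orderPreserved(9999, -1, 0, -200, -10000, 42)); // prints true
    }
}
```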
+
+ === Handling larger numbers ===
+
+ A class that handles all possible long values is available here: http://www.mail-archive.com/lucene-dev@jakarta.apache.org/msg04790.html
+
+ That code handles some special cases near Long.MIN_VALUE, and uses a large radix so that the resulting strings are "compressed".
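
For illustration only, here is one way such an encoding could look. This is not the code from the linked message. It is a sketch that assumes Java 8 or later, flips the sign bit so that every long maps to an unsigned value in the same order, and writes that value in radix 36 with a fixed width of 13 characters. The class name `SortableLongCodec` and the constants are hypothetical.

```java
class SortableLongCodec {
    // Assumption: radix 36 (digits 0-9 then a-z), which sorts correctly as ASCII text.
    private static final int RADIX = 36;
    // 36^13 exceeds 2^64, so 13 characters are enough for any value.
    private static final int WIDTH = 13;

    /** Encodes any long as a fixed-width string whose lexicographic order matches numeric order. */
    static String encode(long value) {
        // Flipping the sign bit maps Long.MIN_VALUE to 0 and Long.MAX_VALUE to 2^64 - 1 (unsigned).
        String digits = Long.toUnsignedString(value ^ Long.MIN_VALUE, RADIX);
        StringBuilder sb = new StringBuilder(WIDTH);
        for (int k = digits.length(); k < WIDTH; k++) {
            sb.append('0'); // left-pad so that all strings have the same length
        }
        return sb.append(digits).toString();
    }

    /** Reverses encode. */
    static long decode(String str) {
        return Long.parseUnsignedLong(str, RADIX) ^ Long.MIN_VALUE;
    }

    public static void main(String[] args) {
        System.out.println(encode(Long.MIN_VALUE)); // prints 0000000000000
        System.out.println(encode(-1L).compareTo(encode(0L)) < 0); // prints true
        System.out.println(decode(encode(Long.MAX_VALUE)) == Long.MAX_VALUE); // prints true
    }
}
```

With this approach Long.MIN_VALUE needs no special case, because the sign-bit flip never overflows.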
---------------------------------------------------------------------
To unsubscribe, e-mail: lucene-dev-unsubscribe@jakarta.apache.org
For additional commands, e-mail: lucene-dev-help@jakarta.apache.org