Posted to dev@lucene.apache.org by bu...@apache.org on 2005/01/25 17:09:07 UTC
DO NOT REPLY [Bug 33239] New: - date encoding limitation removing
http://issues.apache.org/bugzilla/show_bug.cgi?id=33239
Summary: date encoding limitation removing
Product: Lucene
Version: 1.4
Platform: PC
OS/Version: Windows XP
Status: NEW
Severity: enhancement
Priority: P5
Component: Index
AssignedTo: lucene-dev@jakarta.apache.org
ReportedBy: vbychkoviak@i-hypergrid.com
Currently there is a limitation on date encoding in Lucene. I think it exists
because encoded dates must preserve lexicographical ordering, i.e. if one date
precedes another, their encoded values must sort in the same order.
I know that it may be difficult to integrate this into the existing version, but
there is a way to remove the limitation.
Date milliseconds can be encoded as unsigned values with a prefix that indicates
whether the value is positive or negative.
In more detail:
I used hex encoding with the prefixes ‘p’ and ‘n’ for positive and negative
values, and got the following results:
-10000 -> nffffffffffffd8f0
  -100 -> nffffffffffffff9c
     0 -> p0000000000000000
   100 -> p0000000000000064
 10000 -> p0000000000002710
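This preserves the ordering between the values and their encodings: ‘n’ sorts
before ‘p’, so every negative value sorts before every non-negative one, and
within each sign the fixed-width unsigned hex digits sort in numeric order.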
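The ordering claim can be checked with a short sketch. It is only an illustration: the class name SignPrefixedHexDate is hypothetical and not part of Lucene, the width of 16 hex digits is assumed from the examples above, and the extreme values Long.MIN_VALUE and Long.MAX_VALUE are added here as extra test points.

```java
import java.util.Arrays;

// Hypothetical class name; not part of Lucene.
final class SignPrefixedHexDate {
    // Assumed width: 16 hex digits cover all 64 bits of a long.
    static final int DATE_LEN = 16;

    // Encodes time as a sign prefix ('n' or 'p') followed by 16 unsigned hex digits.
    static String encode(long time) {
        char[] chars = new char[DATE_LEN + 1];
        chars[0] = time >= 0 ? 'p' : 'n';
        for (int i = DATE_LEN; i >= 1; i--) {
            chars[i] = Character.forDigit((int) (time & 0x0F), 16);
            time >>>= 4; // unsigned shift, so negative values are handled too
        }
        return new String(chars);
    }

    public static void main(String[] args) {
        long[] times = {10000L, -100L, 0L, Long.MIN_VALUE, 100L, -10000L, Long.MAX_VALUE};
        String[] encoded = new String[times.length];
        for (int i = 0; i < times.length; i++) {
            encoded[i] = encode(times[i]);
        }
        // Sort the numbers numerically and the strings lexicographically,
        // then check that both orders line up position by position.
        Arrays.sort(times);
        Arrays.sort(encoded);
        for (int i = 0; i < times.length; i++) {
            if (!encoded[i].equals(encode(times[i]))) {
                throw new AssertionError("ordering broken at " + times[i]);
            }
            System.out.println(times[i] + " -> " + encoded[i]);
        }
    }
}
```

If the two sorted sequences ever disagreed, main would throw instead of printing the value and encoding pairs.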
The hex encoding could also be replaced with Character.MAX_RADIX (base 36)
encoding.
Part of the code that does this work:
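// Digit table; it covers radix 36, though the hex loop below uses only
// the first 16 entries.
static final char[] digits = {
    '0' , '1' , '2' , '3' , '4' , '5' ,
    '6' , '7' , '8' , '9' , 'a' , 'b' ,
    'c' , 'd' , 'e' , 'f' , 'g' , 'h' ,
    'i' , 'j' , 'k' , 'l' , 'm' , 'n' ,
    'o' , 'p' , 'q' , 'r' , 's' , 't' ,
    'u' , 'v' , 'w' , 'x' , 'y' , 'z'
};

// Not shown in the original fragment; 16 hex digits for a 64-bit long
// is assumed because it matches the example encodings above.
static final int DATE_LEN = 16;

// The method signature is assumed; the original fragment shows only the body.
static String encode(long time) {
    // 'n' sorts before 'p', so negative values sort first.
    char prefix;
    if (time >= 0) {
        prefix = 'p';
    } else {
        prefix = 'n';
    }
    char[] chars = new char[DATE_LEN + 1];
    int index = DATE_LEN;
    // Emit hex digits from least to most significant. The unsigned shift
    // (>>>) fills with zeros, so the loop also terminates for negatives.
    while (time != 0) {
        int b = (int) (time & 0x0F);
        chars[index--] = digits[b];
        time = time >>> 4;
    }
    // Left-pad with zeros; slot 0 is overwritten by the prefix below.
    while (index >= 0) {
        chars[index--] = '0';
    }
    chars[0] = prefix;
    return new String(chars);
}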
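The Character.MAX_RADIX variant mentioned earlier might look like the following sketch. It is not the reporter's code: the class name SignPrefixedRadix36Date and the decode helper are hypothetical, and it relies on Long.toUnsignedString and Long.parseUnsignedLong, which exist only in Java 8 and later. Bit masking no longer works for radix 36, so the unsigned conversion is left to the JDK and the result is left-padded to a fixed width.

```java
// Hypothetical class name; not part of Lucene. Assumes Java 8 or later.
final class SignPrefixedRadix36Date {
    // Width of the largest unsigned 64-bit value in base 36; every
    // encoded value is left-padded with '0' to this many digits.
    static final int DIGITS =
        Long.toUnsignedString(-1L, Character.MAX_RADIX).length();

    // Sign prefix ('n' or 'p') followed by fixed-width unsigned base-36 digits.
    static String encode(long time) {
        String digits = Long.toUnsignedString(time, Character.MAX_RADIX);
        StringBuilder sb = new StringBuilder(DIGITS + 1);
        sb.append(time >= 0 ? 'p' : 'n');
        for (int i = digits.length(); i < DIGITS; i++) {
            sb.append('0');
        }
        return sb.append(digits).toString();
    }

    // Inverse of encode: drop the prefix and parse the unsigned digits.
    static long decode(String encoded) {
        return Long.parseUnsignedLong(encoded.substring(1), Character.MAX_RADIX);
    }
}
```

The prefix is redundant for decoding, since the unsigned digits already carry the sign bit; it is kept so that negative values sort before positive ones, as in the hex version.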