Posted to dev@lucene.apache.org by bu...@apache.org on 2005/05/23 20:27:51 UTC
DO NOT REPLY [Bug 35029] New: - Inconsistent Read and write behavior
DO NOT REPLY TO THIS EMAIL, BUT PLEASE POST YOUR BUG
RELATED COMMENTS THROUGH THE WEB INTERFACE AVAILABLE AT
<http://issues.apache.org/bugzilla/show_bug.cgi?id=35029>.
ANY REPLY MADE TO THIS MESSAGE WILL NOT BE COLLECTED AND
INSERTED IN THE BUG DATABASE.
http://issues.apache.org/bugzilla/show_bug.cgi?id=35029
Summary: Inconsistent Read and write behavior
Product: Lucene
Version: 1.4
Platform: All
OS/Version: All
Status: NEW
Severity: normal
Priority: P2
Component: Index
AssignedTo: lucene-dev@jakarta.apache.org
ReportedBy: lucene@ziplip.com
When a term with an undefined field is written, the field is inserted into the
index with field number -1, and an exception is thrown when the same index is
read back.
First, the IndexWriter should not allow the operation to succeed if the
field is not known.
Second, if the data is allowed to be written, we should at least be able to
read it back without any problem.
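To illustrate the failure mode, here is a minimal, self-contained sketch. It
does not use any Lucene classes: FieldRegistry, writeVInt, readVInt and
roundTrip are hypothetical stand-ins. It assumes that the name lookup returns
-1 for an unknown field and that field numbers are stored as variable-length
ints. Under those assumptions, -1 is written without complaint and reads back
as -1, and the problem only surfaces when the reader looks the number up.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.util.ArrayList;
import java.util.List;

public class UnknownFieldSketch {

    // Hypothetical stand-in for a field-name registry (not a Lucene class).
    static final class FieldRegistry {
        private final List<String> byNumber = new ArrayList<>();

        int add(String name) {
            byNumber.add(name);
            return byNumber.size() - 1;
        }

        int fieldNumber(String name) {
            return byNumber.indexOf(name); // -1 when the name is not registered
        }

        String fieldName(int number) {
            return byNumber.get(number); // throws IndexOutOfBoundsException for -1
        }
    }

    // Variable-length int: 7 data bits per byte, high bit set on all but the last byte.
    static void writeVInt(ByteArrayOutputStream out, int i) {
        while ((i & ~0x7F) != 0) {
            out.write((i & 0x7F) | 0x80);
            i >>>= 7;
        }
        out.write(i);
    }

    static int readVInt(ByteArrayInputStream in) {
        int b = in.read();
        int i = b & 0x7F;
        for (int shift = 7; (b & 0x80) != 0; shift += 7) {
            b = in.read();
            i |= (b & 0x7F) << shift;
        }
        return i;
    }

    // Writes the field number for `name` with no validation, then reads it back.
    static int roundTrip(FieldRegistry registry, String name) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        writeVInt(out, registry.fieldNumber(name));
        return readVInt(new ByteArrayInputStream(out.toByteArray()));
    }

    public static void main(String[] args) {
        FieldRegistry registry = new FieldRegistry();
        registry.add("title");

        int number = roundTrip(registry, "body"); // "body" was never registered
        System.out.println("field number read back: " + number);
        try {
            registry.fieldName(number);
        } catch (IndexOutOfBoundsException e) {
            System.out.println("read side fails: " + e);
        }
    }
}
```

In this sketch the write succeeds silently, which is why rejecting the unknown
field on the write side reports the problem closer to its cause.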
If one uses the default IndexReader, IndexWriter and SegmentMerger, this
error may not occur. However, it is a simple fix for the code not to accept bad
data. Please review and commit the changes. I am not sure whether there are any
other classes that require a similar fix. Our usage uncovered the following
files:
--
TermInfosWriter
  private final void writeTerm(Term term)
       throws IOException {
    int iField = fieldInfos.fieldNumber(term.field);
    if (iField < 0) {
      throw new IOException("Unknown field "+term.field+"; term="+term.text);
    }
    int start = stringDifference(lastTerm.text, term.text);
    int length = term.text.length() - start;
    output.writeVInt(start);                      // write shared prefix length
    output.writeVInt(length);                     // write delta length
    output.writeChars(term.text, start, length);  // write delta chars
    output.writeVInt(iField);                     // write field num
    lastTerm = term;
  }
FieldsReader
  final Document doc(int n) throws IOException {
    indexStream.seek(n * 8L);
    long position = indexStream.readLong();
    fieldsStream.seek(position);
    Document doc = new Document();
    int numFields = fieldsStream.readVInt();
    for (int i = 0; i < numFields; i++) {
      int fieldNumber = fieldsStream.readVInt();
      byte bits = fieldsStream.readByte();
      String stFieldValue = fieldsStream.readString();
      if (fieldNumber >= 0) {
        FieldInfo fi = fieldInfos.fieldInfo(fieldNumber);
        doc.add(new Field(fi.name,            // name
                          stFieldValue,       // read value
                          true,               // stored
                          fi.isIndexed,       // indexed
                          (bits & 1) != 0));  // tokenized
      }
    }
    return doc;
  }
-- FieldsWriter.java
  final void addDocument(Document doc) throws IOException {
    indexStream.writeLong(fieldsStream.getFilePointer());
    int storedCount = 0;
    Enumeration fields = doc.fields();
    while (fields.hasMoreElements()) {
      Field field = (Field)fields.nextElement();
      if (field.isStored())
        storedCount++;
    }
    fieldsStream.writeVInt(storedCount);
    fields = doc.fields();
    while (fields.hasMoreElements()) {
      Field field = (Field)fields.nextElement();
      if (field.isStored()) {
        int iField = fieldInfos.fieldNumber(field.name());
        if (iField == -1) {
          throw new IOException("Unknown field " + field.name());
        }
        fieldsStream.writeVInt(iField);
        byte bits = 0;
        if (field.isTokenized())
          bits |= 1;
        fieldsStream.writeByte(bits);
        fieldsStream.writeString(field.stringValue());
      }
    }
  }