Posted to java-user@lucene.apache.org by Szymon Sutek <da...@gmail.com> on 2016/12/02 11:48:34 UTC
Unable to retrieve TermVectorOffsets using Lucene 6
Hello, I am trying to index a txt file and then retrieve its terms' offset
positions (if a term occurred more than once while indexing). Here are the
most important parts of the code:
1) StandardAnalyzer is used.
2) FieldType used while indexing:
FieldType fieldType = new FieldType();
fieldType.setTokenized(true);
fieldType.setStoreTermVectors(true);
fieldType.setStoreTermVectorPositions(true);
fieldType.setStoreTermVectorOffsets(true);
fieldType.setIndexOptions(IndexOptions.DOCS_AND_FREQS_AND_POSITIONS_AND_OFFSETS);
3) doc.add(new Field("fieldname", reader, fieldType));
4) After successfully creating the index, I am using indexReader to read the
terms and iterate through all of them, but I have no idea how to collect the
offset vector.
In earlier versions I would cast the TermVector to the needed vector type and
get the offset list for a concrete term value. Now I am stuck on this part of
the code:
Terms terms = indexReader.getTermVector(0,"text");
TermsEnum iterator = terms.iterator();
BytesRef byteRef = null;
while ((byteRef = iterator.next()) != null) {
    String term = byteRef.utf8ToString();
    // Here I don't know how to get the offset vector for the given term
}
I would be grateful for any help!
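
For reference, here is a minimal sketch of one way the loop above could be completed in Lucene 6, assuming the term vectors were indexed with offsets as in step 2. It asks the TermsEnum for a PostingsEnum with the PostingsEnum.OFFSETS flag and reads startOffset()/endOffset() after each nextPosition() call. The class name TermVectorOffsets and the method collectOffsets are hypothetical names chosen for illustration. Note that getTermVector returns null when the document has no term vector for the requested field, and that the snippets above index into "fieldname" but read from "text"; the field name has to match the one used at index time.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.PostingsEnum;
import org.apache.lucene.index.Terms;
import org.apache.lucene.index.TermsEnum;
import org.apache.lucene.search.DocIdSetIterator;
import org.apache.lucene.util.BytesRef;

// Hypothetical helper class, not part of Lucene.
public class TermVectorOffsets {

    // Returns term -> list of {startOffset, endOffset} pairs for one document.
    public static Map<String, List<int[]>> collectOffsets(
            IndexReader indexReader, int docId, String field) throws IOException {
        Map<String, List<int[]>> result = new LinkedHashMap<>();

        Terms terms = indexReader.getTermVector(docId, field);
        if (terms == null) {
            // No term vector stored for this document/field combination.
            return result;
        }

        TermsEnum iterator = terms.iterator();
        PostingsEnum postings = null; // reused across terms
        BytesRef byteRef;
        while ((byteRef = iterator.next()) != null) {
            String term = byteRef.utf8ToString();

            // Request offsets; a term vector behaves like a one-document index.
            postings = iterator.postings(postings, PostingsEnum.OFFSETS);
            if (postings.nextDoc() == DocIdSetIterator.NO_MORE_DOCS) {
                continue;
            }

            int freq = postings.freq(); // occurrences of the term in this document
            List<int[]> offsets = new ArrayList<>(freq);
            for (int i = 0; i < freq; i++) {
                postings.nextPosition(); // must be called before reading offsets
                offsets.add(new int[] {postings.startOffset(), postings.endOffset()});
            }
            result.put(term, offsets);
        }
        return result;
    }
}
```

With the code from the question, this would be called as collectOffsets(indexReader, 0, "fieldname"). If offsets were not stored in the term vector, startOffset() and endOffset() return -1.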