Posted to java-user@lucene.apache.org by Vadim Gindin <vg...@detectum.com> on 2018/03/23 08:15:14 UTC
Postings.getPayload() returns null
Hi all.
I have a simplified test that defines an index with one document and one
field in it. The indexing analyzer defines just one filter, which writes 1
byte to the payload. My *goal* is to write numbers to the payload for each
position of each term; they will be used in a custom scoring formula
implemented in a custom Query/Scorer.
When I try to read the written payload (in the custom query) I get NULL. I
suppose I should call postingsEnum.nextPosition() before calling
getPayload(), but when I call nextPosition() I get an error:
java.lang.AssertionError: got line=field model
at __randomizedtesting.SeedInfo.seed([D334C9D1B5C155E3:2AAE4BE5481F4C8F]:0)
at org.apache.lucene.codecs.simpletext.SimpleTextFieldsReader$SimpleTextPostingsEnum.nextPosition(SimpleTextFieldsReader.java:455)
As you can see, I also used SimpleTextCodec to make sure that the payload
was really written to the index along with the positions. It is. I am
probably doing something wrong with the positions, reading them
incorrectly, or missing something important.
*Question*: What am I doing wrong? How do I read and write payloads correctly?
Here is my test:
public class PayloadTest extends LuceneTestCase {

    private IndexSearcher searcher;
    private IndexReader reader;
    private byte[] payloadField = new byte[]{1};
    protected Directory directory;

    private class PayloadAnalyzer extends Analyzer {
        @Override
        public TokenStreamComponents createComponents(String fieldName) {
            Tokenizer tokenizer = new LowerCaseTokenizer();
            PayloadFilter filter = new PayloadFilter(tokenizer, fieldName);
            return new TokenStreamComponents(tokenizer, filter);
        }
    }

    private class PayloadFilter extends TokenFilter {
        PayloadAttribute payloadAtt;
        PositionIncrementAttribute positionAtt;

        public PayloadFilter(TokenStream input, String fieldName) {
            super(input);
            payloadAtt = addAttribute(PayloadAttribute.class);
            // I tried also without position attribute here with the same error
            positionAtt = addAttribute(PositionIncrementAttribute.class);
        }

        @Override
        public boolean incrementToken() throws IOException {
            boolean hasNext = input.incrementToken();
            if (hasNext) {
                payloadAtt.setPayload(new BytesRef(payloadField));
                // I tried also without position attribute here with the same error
                positionAtt.setPositionIncrement(1);
                return true;
            } else {
                return false;
            }
        }
    }

    @Override
    public void setUp() throws Exception {
        super.setUp();
        directory = newDirectory();
        RandomIndexWriter writer = new RandomIndexWriter(random(), directory,
                newIndexWriterConfig(new PayloadAnalyzer())
                        .setMergePolicy(newLogMergePolicy())
                        .setCodec(new SimpleTextCodec()));
        Document doc = new Document();
        doc.add(new TextField("model", "ford focus", Field.Store.YES));
        writer.addDocument(doc);
        reader = writer.getReader();
        writer.close();
        searcher = newSearcher(reader);
    }

    @Override
    public void tearDown() throws Exception {
        reader.close();
        directory.close();
        super.tearDown();
    }

    public void testQuery() throws IOException {
        int limit = 20;
        try (IndexReader reader = DirectoryReader.open(directory)) {
            Query query = new CustomPhraseQuery(
                    Arrays.asList("ford", "focus"),
                    new HashMap<String, Float>() {{
                        put("model", 5.0f);
                    }},
                    new HashMap<String, List<String>>() {{
                        put("ford", Arrays.asList("ford^1.0"));
                        put("focus", Arrays.asList("focus^1.0"));
                    }},
                    Arrays.asList("model"),
                    null
            );
            printSearchResults(limit, query, reader);
        }
    }

    private static void printSearchResults(final int limit, final Query query,
                                           final IndexReader reader)
            throws IOException {
        IndexSearcher searcher = new IndexSearcher(reader);
        TopDocs docs = searcher.search(query, limit);
        System.out.println(docs.totalHits + " found for query: " + query);
        for (final ScoreDoc scoreDoc : docs.scoreDocs) {
            System.out.println(searcher.doc(scoreDoc.doc));
        }
    }
}
Here is the code from CustomPhraseQuery.scorer():
for (String field : fieldScores.keySet()) {
    final Terms fieldTerms = reader.terms(field);
    if (fieldTerms == null) {
        continue;
    }
    if (!fieldTerms.hasPositions())
        throw new IllegalStateException("Index does not contain positions");
    if (!fieldTerms.hasPayloads())
        throw new IllegalStateException("Index does not contain payloads");
    final TermsEnum te = fieldTerms.iterator();
    for (int j = 0; j < terms.length; j++) {
        final Term t = terms[j];
        if (t.field().equals(field) && te.seekExact(t.bytes())) {
            PostingsEnum postingsEnum = te.postings(null, PostingsEnum.ALL);
            int pos = postingsEnum.nextPosition();
            BytesRef payload = postingsEnum.getPayload();
            // assert payload.bytesEquals(new BytesRef(new byte[]{1}));
            // TODO: use payload in scoring formula
            fldScorers.add(new ConstTermScorer(this, t,
                    fieldScores.get(field) * termScores.get(t.text()),
                    postingsEnum));
        }
    }
}
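A note on calling order, offered as a hedged sketch rather than a confirmed
diagnosis: a PostingsEnum is not positioned on any document until nextDoc()
or advance() has been called, and the scorer code above calls
nextPosition() right after te.postings(...). The toy classes below are not
Lucene code. ToyPostings and PayloadReadSketch are hypothetical stand-ins
that only mimic the nextDoc() / freq() / nextPosition() / getPayload()
order, so that the order can be run and checked without an index.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy stand-in for a postings iterator. Hypothetical; NOT Lucene's PostingsEnum. */
final class ToyPostings {
    static final int NO_MORE_DOCS = Integer.MAX_VALUE;

    private final int[] docIds;         // documents that contain the term
    private final byte[][][] payloads;  // payloads[docIndex][positionIndex] = payload bytes
    private int docIndex = -1;          // -1: not positioned on any document yet
    private int posIndex = -1;          // -1: no position consumed in the current document

    ToyPostings(int[] docIds, byte[][][] payloads) {
        this.docIds = docIds;
        this.payloads = payloads;
    }

    /** Moves to the next document and resets the position cursor. */
    int nextDoc() {
        if (docIndex < docIds.length) {
            docIndex++;
        }
        posIndex = -1;
        return docIndex < docIds.length ? docIds[docIndex] : NO_MORE_DOCS;
    }

    /** Number of positions of the term in the current document. */
    int freq() {
        checkOnDoc();
        return payloads[docIndex].length;
    }

    /** Moves to the next position; only valid on a document, at most freq() times. */
    int nextPosition() {
        checkOnDoc();
        if (posIndex + 1 >= payloads[docIndex].length) {
            throw new IllegalStateException("nextPosition() called more than freq() times");
        }
        return ++posIndex;
    }

    /** Payload at the current position; only valid after nextPosition(). */
    byte[] getPayload() {
        checkOnDoc();
        if (posIndex < 0) {
            throw new IllegalStateException("call nextPosition() first");
        }
        return payloads[docIndex][posIndex];
    }

    private void checkOnDoc() {
        if (docIndex < 0 || docIndex >= docIds.length) {
            throw new IllegalStateException("call nextDoc() first");
        }
    }
}

final class PayloadReadSketch {
    /** Collects the first payload byte of every position, in iteration order. */
    static List<Byte> firstPayloadBytes(ToyPostings postings) {
        List<Byte> out = new ArrayList<>();
        while (postings.nextDoc() != ToyPostings.NO_MORE_DOCS) {
            int freq = postings.freq();
            for (int i = 0; i < freq; i++) {
                postings.nextPosition();                 // must precede getPayload()
                byte[] payload = postings.getPayload();
                if (payload != null && payload.length > 0) {
                    out.add(payload[0]);
                }
            }
        }
        return out;
    }
}
```

Against a real index the same loop shape would apply to the PostingsEnum
returned by te.postings(null, PostingsEnum.ALL): loop on nextDoc() until
DocIdSetIterator.NO_MORE_DOCS, and within each document call nextPosition()
at most freq() times, reading getPayload() after each call.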
Regards,
Vadim Gindin