Posted to java-user@lucene.apache.org by Piotr Pęzik <pi...@gmail.com> on 2012/12/17 13:15:43 UTC

TermVectors and Attributes in Lucene 4.0

Hi,

I've been trying to enumerate all terms in all documents of a
Lucene 4.0 index in order to retrieve their attributes (payloads,
positions, etc.).

I have an index whose documents contain stored, tokenized fields with
term vectors, offsets and payloads. Below is what I have tried so far
(I have to admit I don't fully understand this part of the 4.0 API yet).

My question is: can I use TermsEnum, DocsEnum or
DocsAndPositionsEnum to access each term in each document and get its
attributes? They all have an attributes() method, but so far I haven't
managed to make it return the actual attributes of individual terms (not
even the CharTermAttribute).


Thanks,

Piotr Pezik


//Checking field type:

Document doc = dReader.document(1);
System.out.println(doc.getField("myField").fieldType());
//=> stored,indexed,tokenized,termVector,indexOptions=DOCS_AND_FREQS_AND_POSITIONS_AND_OFFSETS

//Getting Terms and TermsEnum:

Terms terms = SlowCompositeReaderWrapper
                 .wrap(dReader).terms("myField");
TermsEnum tenum = terms.iterator(TermsEnum.EMPTY);

//Moving to the next term (?)

BytesRef br = tenum.next();

System.out.println(tenum.attributes().hasAttributes());

//=> false

System.out.println(tenum.attributes().getAttribute(PositionIncrementAttribute.class));

// => java.lang.IllegalArgumentException: This AttributeSource does not
//    have the attribute
//    'org.apache.lucene.analysis.tokenattributes.PositionIncrementAttribute'.

Bits liveDocs = SlowCompositeReaderWrapper.wrap(dReader).getLiveDocs();


DocsEnum denum  = tenum.docs(liveDocs, null);
denum.nextDoc();
System.out.println(denum.attributes().hasAttributes());

//=> false

DocsAndPositionsEnum denum2  = tenum.docsAndPositions(liveDocs, null);
denum2.nextDoc();
System.out.println(denum2.attributes().hasAttributes());

//=> false
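
The traversal asked about above is nested three levels deep: every term,
every document containing that term, and every occurrence of the term in
that document, with position, offsets and payload attached to the
occurrence. A minimal plain-Java sketch of that shape follows. It is a toy
model with no Lucene dependency: PostingsSketch, Occurrence, sample() and
dump() are hypothetical names, and the data is made up for illustration,
so it says nothing about which Lucene 4.0 calls expose these values.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Toy model only: none of these types are Lucene classes.
final class PostingsSketch {

    // One occurrence of a term in a document (hypothetical type).
    static final class Occurrence {
        final int position;
        final int startOffset;
        final int endOffset;
        final String payload; // null when no payload was stored

        Occurrence(int position, int startOffset, int endOffset, String payload) {
            this.position = position;
            this.startOffset = startOffset;
            this.endOffset = endOffset;
            this.payload = payload;
        }
    }

    // Registers one occurrence under term -> docId, creating levels on demand.
    static void add(Map<String, Map<Integer, List<Occurrence>>> index,
                    String term, int docId, Occurrence occ) {
        index.computeIfAbsent(term, t -> new TreeMap<>())
             .computeIfAbsent(docId, d -> new ArrayList<>())
             .add(occ);
    }

    // Made-up sample data: term -> (docId -> occurrences), both levels sorted.
    static Map<String, Map<Integer, List<Occurrence>>> sample() {
        Map<String, Map<Integer, List<Occurrence>>> index = new TreeMap<>();
        add(index, "lucene", 0, new Occurrence(0, 0, 6, "NN"));
        add(index, "lucene", 1, new Occurrence(2, 10, 16, null));
        add(index, "index", 0, new Occurrence(1, 7, 12, "NN"));
        return index;
    }

    // Walks terms, then documents per term, then occurrences per document.
    static List<String> dump(Map<String, Map<Integer, List<Occurrence>>> index) {
        List<String> lines = new ArrayList<>();
        for (Map.Entry<String, Map<Integer, List<Occurrence>>> term : index.entrySet()) {
            for (Map.Entry<Integer, List<Occurrence>> doc : term.getValue().entrySet()) {
                for (Occurrence occ : doc.getValue()) {
                    lines.add(term.getKey() + " doc=" + doc.getKey()
                            + " pos=" + occ.position
                            + " offsets=" + occ.startOffset + "-" + occ.endOffset
                            + " payload=" + occ.payload);
                }
            }
        }
        return lines;
    }

    public static void main(String[] args) {
        for (String line : dump(sample())) {
            System.out.println(line);
        }
    }
}
```

Running main prints one line per occurrence, ordered by term and then by
document, because both outer levels are sorted maps.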



