Posted to dev@lucene.apache.org by Shai Erera <se...@gmail.com> on 2009/08/27 05:51:19 UTC

Adding Field twice w/ Payload - bug or works as designed?

Hi

I don't know whether this is supported, but I wrote the following simple
example code to show what I am trying to do.

    Directory dir = new RAMDirectory();
    Analyzer a = new SimpleAnalyzer();
    IndexWriter writer = new IndexWriter(dir, a, MaxFieldLength.UNLIMITED);
    Document doc = new Document();
    // First instance of a:abc: a single un-analyzed token, no payload.
    doc.add(new Field("a", "abc", Store.NO, Index.NOT_ANALYZED));
    final Term t = new Term("a", "abc");
    // Second instance of a:abc: a one-token stream carrying a 1-byte payload.
    doc.add(new Field(t.field(), new TokenStream() {
      boolean done = false;
      @Override
      public Token next(Token reusableToken) throws IOException {
        if (done) return null;
        done = true;
        reusableToken.setTermBuffer(t.text());
        reusableToken.setPayload(new Payload(new byte[] { 1 }));
        return reusableToken;
      }
    }));
    writer.addDocument(doc);
    writer.commit();
    writer.close();

    // Read back the postings for a:abc and print a payload length.
    IndexReader reader = IndexReader.open(dir, true);
    TermPositions tp = reader.termPositions(t);
    tp.next();          // advance to the first matching document
    tp.nextPosition();  // advance to the term's first position in that document
    System.out.println(tp.getPayloadLength());
    reader.close();

Basically, I add the same Field twice (a:abc); the second time I also set a
Payload. The program prints 0 as the payload length (the println one line
above the last). If I change either the field name or the field text, it
prints 1.

Bug or works as designed?

Shai

Re: Adding Field twice w/ Payload - bug or works as designed?

Posted by Shai Erera <se...@gmail.com>.
Ohh, right. I missed that. Indeed, after I call nextPosition() again, it
prints 1. Thanks!

Shai

On Thu, Aug 27, 2009 at 7:09 AM, Michael Busch <bu...@gmail.com> wrote:

> The first occurrence of your term does not have a payload, the second one
> does. So getPayloadLength() correctly returns 0, because the TermPositions
> is at the first occurrence. If you call nextPosition() again and then dump
> the payload length it should be 1.
>
>  Michael
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-dev-help@lucene.apache.org
>
>

Re: Adding Field twice w/ Payload - bug or works as designed?

Posted by Michael Busch <bu...@gmail.com>.
The first occurrence of your term does not have a payload; the second
one does. So getPayloadLength() correctly returns 0, because the
TermPositions is at the first occurrence. If you call nextPosition()
again and then print the payload length, it should be 1.

  Michael
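
The behaviour described above can be illustrated with a small toy model in
plain Java. This is only a sketch, not Lucene code: the classes Occurrence and
PositionCursor are hypothetical stand-ins for the per-document position
enumeration, and the positions 0 and 1 are assumed for illustration. The point
it tries to show is that the payload length always refers to the occurrence
the cursor is currently on, so every occurrence has to be visited to see every
payload.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model (not Lucene code) of a per-document position cursor. */
public class PayloadCursorSketch {

    /** One occurrence of a term in a document: a position plus an optional payload. */
    static final class Occurrence {
        final int position;
        final byte[] payload; // an empty array stands for "no payload"

        Occurrence(int position, byte[] payload) {
            this.position = position;
            this.payload = payload;
        }
    }

    /** Hypothetical stand-in for the nextPosition()/getPayloadLength() pair. */
    static final class PositionCursor {
        private final List<Occurrence> occurrences;
        private int current = -1; // starts before the first occurrence

        PositionCursor(List<Occurrence> occurrences) {
            this.occurrences = occurrences;
        }

        /** Number of occurrences of the term in this document. */
        int freq() {
            return occurrences.size();
        }

        /** Advances to the next occurrence and returns its position. */
        int nextPosition() {
            current++;
            return occurrences.get(current).position;
        }

        /** Payload length of the current occurrence only. */
        int getPayloadLength() {
            return occurrences.get(current).payload.length;
        }
    }

    /** Models the document from the thread: a:abc twice, payload only on the second. */
    static PositionCursor cursorForThreadExample() {
        List<Occurrence> occurrences = new ArrayList<Occurrence>();
        occurrences.add(new Occurrence(0, new byte[0]));       // first Field: no payload
        occurrences.add(new Occurrence(1, new byte[] { 1 }));  // second Field: 1-byte payload
        return new PositionCursor(occurrences);
    }

    /** Visits every occurrence and collects the payload length seen at each one. */
    static List<Integer> payloadLengths(PositionCursor cursor) {
        List<Integer> lengths = new ArrayList<Integer>();
        for (int i = 0; i < cursor.freq(); i++) {
            cursor.nextPosition();
            lengths.add(cursor.getPayloadLength());
        }
        return lengths;
    }

    public static void main(String[] args) {
        // Prints [0, 1]: no payload at the first occurrence, one byte at the second.
        System.out.println(payloadLengths(cursorForThreadExample()));
    }
}
```

In the original program the same idea would presumably translate to calling
nextPosition() once per occurrence (tp.freq() times) and checking the payload
length after each call, rather than only after the first.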


