Posted to java-user@lucene.apache.org by Ankit Murarka <an...@rancoretech.com> on 2013/08/13 13:45:39 UTC

Trying to store Offsets. Don't know the exact meaning of some terms.

Hello,

I generally add fields to my document in the following manner:

doc.add(new StringField("contents", line, Field.Store.YES));

I also want to store offsets for this field. Going through the Javadoc, I
found that I need to use FieldType, so I ended up with:

FieldType fieldType = new FieldType(TextField.TYPE_STORED);
fieldType.setIndexed(true);
fieldType.setIndexOptions(IndexOptions.DOCS_AND_FREQS_AND_POSITIONS_AND_OFFSETS);
fieldType.setStored(true);
fieldType.setStoreTermVectorOffsets(true);

I then added this field to the document as follows:

doc.add(new Field("contents", line, fieldType));

Problems I encountered:
a. An exception: Exception in thread "main"
java.lang.IllegalArgumentException: cannot index term vector offsets
when term vectors are not indexed (field="contents
b. I hardly know what the above setters do. I found them by searching
online and simply used them.
c. I tried to understand what a term vector is, but could not make
sense of it.

Kindly provide some guidance.


-- 
Regards

Ankit


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Re: Trying to store Offsets. Don't know the exact meaning of some terms.

Posted by rizwan patel <ri...@gmail.com>.
Ankit, a term vector is a per-document record of the terms in a field,
optionally with each term's positions and offsets. A term's position is
its ordinal place in the token stream; its offsets are the start and end
character offsets of the term in the original text. For example, in
"lucene is smart" the terms are lucene, is, and smart. Their positions are
lucene: 0, is: 1, smart: 2, and their character offsets are lucene: 0-6,
is: 7-9, smart: 10-15.
Term vector offsets can be stored only if you have also enabled term
vectors for that field.
FieldType has one more setter, setStoreTermVectors(true), which does
that. Try calling it as well.
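
To make the difference between a term's position and its character offsets
concrete, here is a small plain-Java sketch. It does not use Lucene at all:
the class name PositionsAndOffsets and the whitespace-only tokenizer are
assumptions made for illustration, and a real Lucene analyzer may split and
normalize text differently.

```java
import java.util.ArrayList;
import java.util.List;

public class PositionsAndOffsets {

    /** One token with its ordinal position and its character offsets. */
    public static final class Token {
        public final String term;
        public final int position;    // 0-based index of the token in the stream
        public final int startOffset; // index of the token's first character
        public final int endOffset;   // index one past the token's last character

        Token(String term, int position, int startOffset, int endOffset) {
            this.term = term;
            this.position = position;
            this.startOffset = startOffset;
            this.endOffset = endOffset;
        }

        @Override
        public String toString() {
            return term + ": position=" + position
                    + ", offsets=" + startOffset + "-" + endOffset;
        }
    }

    /** Splits on whitespace only (an assumption; not how Lucene analyzers work in general). */
    public static List<Token> tokenize(String text) {
        List<Token> tokens = new ArrayList<>();
        int position = 0;
        int i = 0;
        while (i < text.length()) {
            // Skip any whitespace before the next token.
            while (i < text.length() && Character.isWhitespace(text.charAt(i))) {
                i++;
            }
            if (i >= text.length()) {
                break;
            }
            int start = i;
            // Consume characters up to the next whitespace.
            while (i < text.length() && !Character.isWhitespace(text.charAt(i))) {
                i++;
            }
            tokens.add(new Token(text.substring(start, i), position++, start, i));
        }
        return tokens;
    }

    public static void main(String[] args) {
        for (Token token : tokenize("lucene is smart")) {
            System.out.println(token);
        }
    }
}
```

Positions count tokens, while offsets count characters, which is why a
highlighter needs offsets to mark the matching span in the original text.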



-- 
Thanks and Regards,
Rizwan

Re: Trying to store Offsets. Don't know the exact meaning of some terms.

Posted by rizwan patel <ri...@gmail.com>.
Thanks Mike, this clarifies my understanding as well.
Regards,
Rizwan




-- 
Thanks and Regards,
Rizwan

Re: Trying to store Offsets. Don't know the exact meaning of some terms.

Posted by Michael McCandless <lu...@mikemccandless.com>.
I think you just need to add fieldType.setStoreTermVectors(true) as well.

However, I see you are also indexing offsets into the postings, which
is wasteful because now you've indexed offsets twice in your index.

Usually only one place is needed, i.e. if you will use
PostingsHighlighter, only index offsets into postings, but if you will
use one of the older highlighters, only index offsets into term
vectors.
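
As a rough sketch of the two alternatives described above, the following
assumes a Lucene 4.x classpath (where IndexOptions lives in
org.apache.lucene.index.FieldInfo). The class name OffsetFieldTypes and the
two helper methods are made up for illustration. Storing term vector
positions as well is an extra choice made here because FastVectorHighlighter
needs both positions and offsets; it is not part of the original question.

```java
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.FieldType;
import org.apache.lucene.document.TextField;
import org.apache.lucene.index.FieldInfo.IndexOptions; // Lucene 4.x location (assumption)

public class OffsetFieldTypes {

    /** Offsets stored in term vectors only, for the older highlighters. */
    static FieldType termVectorOffsets() {
        // Copying TYPE_STORED gives an indexed, tokenized, stored field type.
        FieldType type = new FieldType(TextField.TYPE_STORED);
        type.setStoreTermVectors(true);         // must be enabled before offsets
        type.setStoreTermVectorPositions(true); // optional; see the note above
        type.setStoreTermVectorOffsets(true);
        type.freeze();                          // guard against later changes
        return type;
    }

    /** Offsets stored in the postings only, for PostingsHighlighter. */
    static FieldType postingsOffsets() {
        FieldType type = new FieldType(TextField.TYPE_STORED);
        type.setIndexOptions(IndexOptions.DOCS_AND_FREQS_AND_POSITIONS_AND_OFFSETS);
        type.freeze();
        return type;
    }

    public static void main(String[] args) {
        String line = "lucene is smart";
        Document doc = new Document();
        // Pick one of the two field types, not both, to avoid indexing offsets twice.
        doc.add(new Field("contents", line, termVectorOffsets()));
        System.out.println(doc);
    }
}
```

Either helper on its own avoids the IllegalArgumentException from the
original post: the first enables term vectors before asking for their
offsets, and the second never touches the term vector setters.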

Mike McCandless

http://blog.mikemccandless.com

