Posted to java-user@lucene.apache.org by Adrien Grand <jp...@gmail.com> on 2019/12/03 12:56:38 UTC

Re: Multi-IDF for a single term possible?

Is there any reason why you are not storing each DOC_TYPE in its own index?

On Tue, Dec 3, 2019 at 1:50 PM Ravikumar Govindarajan
<ra...@gmail.com> wrote:
>
> Hello,
>
> We are using TF-IDF for scoring (yet to migrate to BM25). Different
> entities (DOC_TYPES) are crunched & stored together in a single index.
>
> When it comes to IDF, I find that there is a single value computed across
> documents & stored as part of TermStats, whereas our documents are not
> homogeneous. So, a single IDF value doesn't work for us.
>
> We would like to compute IDF for each <Term/DOC_TYPE> pair, store it &
> later use the paired-IDF values during query time. Is something like this
> possible via Codecs or other mechanisms?
>
> Any help is much appreciated.
>
> --
> Ravi



-- 
Adrien

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Re: Multi-IDF for a single term possible?

Posted by Robert Muir <rc...@gmail.com>.
It is enough to give each DOC_TYPE its own field.
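
To illustrate why a separate field per DOC_TYPE helps: in Lucene a term is a (field, text) pair, so document frequencies are kept per field, and the same word indexed under two fields gets two independent IDF values. The sketch below is plain Java, not Lucene code. The class PerTypeIdf and the type names "invoice" and "email" are hypothetical, and the formula 1 + ln((docCount + 1) / (docFreq + 1)) is assumed here as the classic TF-IDF shape; check the Similarity you actually use before relying on the exact numbers.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;

// Hypothetical helper (not a Lucene class): tracks document frequency per
// (DOC_TYPE, term) pair, mimicking what per-field statistics would give.
final class PerTypeIdf {
    // Number of documents seen for each DOC_TYPE.
    private final Map<String, Integer> docCount = new HashMap<>();
    // Number of documents of a DOC_TYPE that contain a given term.
    private final Map<String, Integer> docFreq = new HashMap<>();

    private static String key(String docType, String term) {
        return docType + '\u0000' + term;
    }

    // Registers one document; repeated terms are counted once per document.
    void addDocument(String docType, Collection<String> terms) {
        docCount.merge(docType, 1, Integer::sum);
        for (String term : new HashSet<>(terms)) {
            docFreq.merge(key(docType, term), 1, Integer::sum);
        }
    }

    // Assumed classic IDF shape: 1 + ln((docCount + 1) / (docFreq + 1)).
    double idf(String docType, String term) {
        int n = docCount.getOrDefault(docType, 0);
        int df = docFreq.getOrDefault(key(docType, term), 0);
        return 1.0 + Math.log((n + 1) / (double) (df + 1));
    }

    public static void main(String[] args) {
        PerTypeIdf stats = new PerTypeIdf();
        // "total" appears in every invoice but in only one email.
        stats.addDocument("invoice", Arrays.asList("total", "due"));
        stats.addDocument("invoice", Arrays.asList("total", "paid"));
        stats.addDocument("invoice", Arrays.asList("total", "tax"));
        stats.addDocument("email", Arrays.asList("hello", "total"));
        stats.addDocument("email", Arrays.asList("hello", "meeting"));
        stats.addDocument("email", Arrays.asList("lunch", "friday"));

        System.out.println("idf(invoice, total) = " + stats.idf("invoice", "total"));
        System.out.println("idf(email, total)   = " + stats.idf("email", "total"));
    }
}
```

With one field per DOC_TYPE (for example hypothetical field names such as body_invoice and body_email), queries against a given type would pick up that type's own statistics without a custom codec.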

On Tue, Dec 3, 2019 at 7:57 AM Adrien Grand <jp...@gmail.com> wrote:

> Is there any reason why you are not storing each DOC_TYPE in its own index?