Posted to java-user@lucene.apache.org by Xiaolong Zheng <zh...@gmail.com> on 2016/07/19 17:08:05 UTC

Query the doc frequency across multiple search fields

Hi,

Is there any way to query the doc frequency of a word across multiple
search fields?

The existing API seems to provide this only for a single search field:


  indexReader.docFreq(new Term(field, word))


Any suggestions on how I could get the doc frequency across multiple fields?

Thanks in advance,

--Xiaolong

Re: Query the doc frequency across multiple search fields

Posted by Adrien Grand <jp...@gmail.com>.
Note that if you only have two fields A and B, you could make it faster by
returning `docFreq(A)+docFreq(B)-IndexSearcher.count(A AND B)` rather than
`IndexSearcher.count(A OR B)` since Lucene is typically faster at running
conjunctions than disjunctions.
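
The two-field shortcut above is inclusion-exclusion: the number of
documents matching in either field equals docFreq(A) + docFreq(B) minus
the number matching in both. Below is a minimal sketch of that identity,
using plain Java sets of doc IDs as a stand-in for the index. The class
and method names are hypothetical, and none of this is Lucene API.

```java
import java.util.HashSet;
import java.util.Set;

// Toy model: each set holds the IDs of the documents that contain the word
// in one field. These are plain Java sets, not Lucene postings.
final class TwoFieldDocFreq {

    // Documents that contain the word in both fields (the "A AND B" count).
    static int countBoth(Set<Integer> docsA, Set<Integer> docsB) {
        Set<Integer> both = new HashSet<>(docsA);
        both.retainAll(docsB);
        return both.size();
    }

    // Documents that contain the word in either field (the "A OR B" count),
    // computed directly as the size of the union.
    static int countEither(Set<Integer> docsA, Set<Integer> docsB) {
        Set<Integer> either = new HashSet<>(docsA);
        either.addAll(docsB);
        return either.size();
    }

    // The same number via inclusion-exclusion:
    // docFreq(A) + docFreq(B) - count(A AND B).
    static int combinedDocFreq(Set<Integer> docsA, Set<Integer> docsB) {
        return docsA.size() + docsB.size() - countBoth(docsA, docsB);
    }

    public static void main(String[] args) {
        Set<Integer> fieldA = Set.of(1, 2, 3); // docs with the word in field A
        Set<Integer> fieldB = Set.of(3, 4);    // docs with the word in field B
        System.out.println(combinedDocFreq(fieldA, fieldB)); // 3 + 2 - 1
        System.out.println(countEither(fieldA, fieldB));     // same value
    }
}
```

In a real index, the two set sizes would come from the per-field docFreq
calls and the intersection from counting the conjunction, so only the
conjunction has to be executed as a query.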

On Wed, Jul 20, 2016 at 15:41, Xiaolong Zheng <zh...@gmail.com> wrote:

> Thanks! My use case is that I am trying to calculate the docFreq for
> the suggested word produced by my "did you mean"/"spellcheck" feature.
>
> I was trying to avoid issuing a second search request, but it seems that
> in this case I have to formulate another search query to do the job.
>
>
>
> Sincerely,
>
> --Xiaolong
>
>

Re: Query the doc frequency across multiple search fields

Posted by Xiaolong Zheng <zh...@gmail.com>.
Thanks! My use case is that I am trying to calculate the docFreq for
the suggested word produced by my "did you mean"/"spellcheck" feature.

I was trying to avoid issuing a second search request, but it seems that
in this case I have to formulate another search query to do the job.



Sincerely,

--Xiaolong


On Wed, Jul 20, 2016 at 9:31 AM, Adrien Grand <jp...@gmail.com> wrote:

> There is no way to get this statistic in constant time. If you need it for
> scoring, you need to make approximations. For instance, BlendedTermQuery
> uses the max of the doc freqs as the aggregated doc freq.
>
> Otherwise, you can also compute this number by running a BooleanQuery with
> one SHOULD clause per field.

Re: Query the doc frequency across multiple search fields

Posted by Adrien Grand <jp...@gmail.com>.
There is no way to get this statistic in constant time. If you need it for
scoring, you need to make approximations. For instance, BlendedTermQuery
uses the max of the doc freqs as the aggregated doc freq.

Otherwise, you can also compute this number by running a BooleanQuery with
one SHOULD clause per field.
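
To illustrate the difference between the two options above: the exact
number is the size of the union of the per-field document sets (what a
disjunction with one optional clause per field would count), while the
max of the per-field doc freqs can only underestimate it. Below is a toy
sketch of both, using plain Java sets of doc IDs instead of an index. The
class and method names are hypothetical, and none of this is Lucene API.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Toy model: one set of doc IDs per field, each holding the documents that
// contain the word in that field. Plain Java sets, not Lucene postings.
final class MultiFieldDocFreq {

    // Exact value: documents that contain the word in at least one field.
    // This requires visiting every per-field set, so it is not constant time.
    static int exactDocFreq(List<Set<Integer>> docsPerField) {
        Set<Integer> union = new HashSet<>();
        for (Set<Integer> docs : docsPerField) {
            union.addAll(docs);
        }
        return union.size();
    }

    // Approximation: the largest per-field doc freq. It never exceeds the
    // exact value, and equals it only when one field covers all the others.
    static int maxDocFreq(List<Set<Integer>> docsPerField) {
        int max = 0;
        for (Set<Integer> docs : docsPerField) {
            max = Math.max(max, docs.size());
        }
        return max;
    }

    public static void main(String[] args) {
        List<Set<Integer>> docsPerField = List.of(
                Set.of(1, 2, 3), // field 1
                Set.of(3, 4),    // field 2
                Set.of(5));      // field 3
        System.out.println("exact = " + exactDocFreq(docsPerField));
        System.out.println("max approximation = " + maxDocFreq(docsPerField));
    }
}
```

The approximation only needs the per-field doc freqs, which are cheap to
look up, whereas the exact value needs the disjunction to be executed.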
