Posted to solr-user@lucene.apache.org by Tommaso Teofili <to...@gmail.com> on 2011/06/16 12:39:32 UTC

Showing facet of first N docs

Hi all,
Do you know if it is possible to show the facets for a particular field
related only to the first N docs of the total number of results?
It seems facet.limit doesn't help with it as it defines a window in the
facet constraints returned.
Thanks in advance,
Tommaso

Re: Showing facet of first N docs

Posted by Tommaso Teofili <to...@gmail.com>.
2011/6/18 Dmitry Kan <dm...@gmail.com>

> Do you mean you would like to boost the facets that contain most of the
> lemmas?
>

That would be good, but I'd prefer getting facets from, for example, only the
first 50 of 500 docs.


> What is the user query in this case and, if possible, what is the use case
> (maybe some other solution exists for what you are trying to achieve)?


The use case is to help the user refine a query with the "most relevant"
facets, which in theory come from the most relevant documents.
With 500 results sorted by score (desc), the facet counts would also come
from the documents ranked 490 to 500, which contain less relevant
information.


2011/6/18 lee carroll <le...@googlemail.com>

> Hi Tommaso
>
> I don't think you can achieve what you want using vanilla Solr.
> Facet counts are computed over the whole matching result set, not over
> the top n matching documents.
>
> However, what is your use case? Assuming it's for faceted navigation,
> showing facets for the top n results could be confusing to your users.
> The next incremental filter applied by the user would change the
> "relevancy focus" and produce another set of top n facet counts from
> a document set unrelated to the last result set. This could be a very
> bad user experience, producing fluctuating facet counts (i.e. a filter
> narrowing the search could produce an increase in a facet term count,
> which is very odd). The result set could also change strangely, with docs
> floating in and out of the result list.
>

Right :-) Thanks for pointing this out.


>
> Relevancy seems to be the answer here: if your docs are scored
> correctly, then counting all docs in the result set for the facet
> counts is correct. Do you need to improve relevancy?


I get fairly good relevance after playing a bit with dismax and bq.
I think the problem is just in how the facets are being used. A
customized SpellChecker sounds like the right component to provide smart
suggestions.


2011/6/20 Toke Eskildsen <te...@statsbiblioteket.dk>

> On Thu, 2011-06-16 at 12:39 +0200, Tommaso Teofili wrote:
> > Do you know if it is possible to show the facets for a particular field
> > related only to the first N docs of the total number of results?
>
> It collides with the inner workings of Solr, as faceting does not process
> the doc-IDs from the matching documents in result order. It also uses
> all the hits, but that could be hacked.
>
> What is N? If it is a fairly low number (hundreds) and your documents
> are indexed with a unique ID, you can extract the IDs and perform a
> facet request with the ORed IDs as the query.
>
>
> I am a bit curious about what you're trying to achieve here.
> Conventionally, faceting provides an overview of all data, often
> prioritized by occurrence count. While I understand the idea of trying
> to use weights to prioritize, limiting the faceting to a subset of the
> result set seems very much like a standard ranked document search.
>
>
My use case (that is, my customer's) sounds like a mixed one. As I said, I
suspect an interesting experiment would be to mix the spellcheck results with
facets, using the spellcheck suggestions as facet queries.
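
A rough sketch of what "suggestions as facet queries" could look like on the
request side. It assumes the suggestions have already been collected from a
spellcheck response and that they should be counted against the 'lemmas'
field; the helper name facet_query_params is made up for illustration.
facet.query itself is a standard Solr parameter and may be repeated.

```python
from urllib.parse import urlencode


def facet_query_params(user_query, suggestions, field="lemmas"):
    """Turn each suggestion into one facet.query (hypothetical helper)."""
    params = [("q", user_query), ("facet", "true")]
    for term in suggestions:
        # Escape backslashes and quotes so the term stays a single phrase.
        escaped = term.replace("\\", "\\\\").replace('"', '\\"')
        params.append(("facet.query", '%s:"%s"' % (field, escaped)))
    return params


# Example values only; real suggestions would come from the spellchecker.
params = facet_query_params("apache", ["solr", "lucene"])
print(urlencode(params))
```

Each facet.query returns one count over the whole result set, so this gives
counts for the suggested terms only rather than for every term in the field.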

Thanks all for your responses; they were very useful in understanding how to
approach my use case.
Regards,
Tommaso

Re: Showing facet of first N docs

Posted by Toke Eskildsen <te...@statsbiblioteket.dk>.
On Thu, 2011-06-16 at 12:39 +0200, Tommaso Teofili wrote:
> Do you know if it is possible to show the facets for a particular field
> related only to the first N docs of the total number of results?

It collides with the inner workings of Solr, as faceting does not process
the doc-IDs from the matching documents in result order. It also uses
all the hits, but that could be hacked.

What is N? If it is a fairly low number (hundreds) and your documents
are indexed with a unique ID, you can extract the IDs and perform a
facet request with the ORed IDs as the query.
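
A minimal sketch of that second request, assuming the unique key field is
called 'id', the facet field is 'lemmas', and the IDs of the top N documents
have already been read from a first request. The helper name
facet_params_for_ids is hypothetical; q, rows, facet and facet.field are
standard Solr request parameters.

```python
from urllib.parse import urlencode


def facet_params_for_ids(doc_ids, facet_field="lemmas", id_field="id"):
    """Build Solr parameters that facet only over the given documents."""
    if not doc_ids:
        raise ValueError("need at least one document ID")
    # Quote each ID so characters such as ':' are not parsed as query syntax.
    quoted = [
        '"%s"' % str(d).replace("\\", "\\\\").replace('"', '\\"')
        for d in doc_ids
    ]
    return [
        ("q", "%s:(%s)" % (id_field, " OR ".join(quoted))),
        ("rows", "0"),  # only the facet counts are needed, not the documents
        ("facet", "true"),
        ("facet.field", facet_field),
    ]


# Example IDs only; in practice these come from the first (ranked) request.
params = facet_params_for_ids(["doc1", "doc2", "doc3"])
query_string = urlencode(params)
print(query_string)
```

A long OR list can run into the configured boolean clause limit, which is one
more reason to keep N in the hundreds.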


I am a bit curious about what you're trying to achieve here.
Conventionally, faceting provides an overview of all data, often
prioritized by occurrence count. While I understand the idea of trying
to use weights to prioritize, limiting the faceting to a subset of the
result set seems very much like a standard ranked document search.


Re: Showing facet of first N docs

Posted by ka...@gmx.de.
Hi Tommaso,

The FacetComponent works with DocListAndSet#docSet.
It should be easy to switch to DocListAndSet#docList, which contains only the documents for the result list (by default the top 10, but it could be documents 15-25 if start=15 and rows=11). That would mean changing the source code.

Instead of changing the source code, the easier way should be to send a second request with a relevance filter (if your sort criterion is relevance):
 http://lucene.472066.n3.nabble.com/Filter-by-relevance-td1837486.html

Best regards
  Karsten

http://lucene.472066.n3.nabble.com/Showing-facet-of-first-N-docs-td3071395.html
-------- Original Message --------
> Date: Thu, 16 Jun 2011 12:39:32 +0200
> From: Tommaso Teofili <to...@gmail.com>
> To: solr-user@lucene.apache.org
> Subject: Showing facet of first N docs

> Hi all,
> Do you know if it is possible to show the facets for a particular field
> related only to the first N docs of the total number of results?
> It seems facet.limit doesn't help with it as it defines a window in the
> facet constraints returned.
> Thanks in advance,
> Tommaso

Re: Showing facet of first N docs

Posted by lee carroll <le...@googlemail.com>.
Hi Tommaso

I don't think you can achieve what you want using vanilla Solr.
Facet counts are computed over the whole matching result set, not over
the top n matching documents.

However, what is your use case? Assuming it's for faceted navigation,
showing facets for the top n results could be confusing to your users.
The next incremental filter applied by the user would change the
"relevancy focus" and produce another set of top n facet counts from
a document set unrelated to the last result set. This could be a very
bad user experience, producing fluctuating facet counts (i.e. a filter
narrowing the search could produce an increase in a facet term count,
which is very odd). The result set could also change strangely, with docs
floating in and out of the result list.

Relevancy seems to be the answer here: if your docs are scored
correctly, then counting all docs in the result set for the facet
counts is correct. Do you need to improve relevancy?
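
To make the fluctuation concrete, here is a toy simulation with made-up data
(the documents, the 'category' filter field and the function name are all
invented for illustration). It counts 'lemmas' values over the top 2 hits
before and after a narrowing filter.

```python
from collections import Counter

# Hypothetical result list, already sorted by score (best first).
RANKED_DOCS = [
    {"id": 1, "category": "A", "lemmas": ["xyz"]},
    {"id": 2, "category": "B", "lemmas": ["abc"]},
    {"id": 3, "category": "A", "lemmas": ["xyz"]},
    {"id": 4, "category": "B", "lemmas": ["abc"]},
]


def top_n_facets(docs, n, field="lemmas"):
    """Count field values over the first n docs of a ranked list."""
    counts = Counter()
    for doc in docs[:n]:
        counts.update(doc[field])
    return counts


# Unfiltered: the top 2 are docs 1 and 2, so 'xyz' is seen once.
before = top_n_facets(RANKED_DOCS, n=2)

# Narrow to category A: the top 2 are now docs 1 and 3, so 'xyz' is seen twice.
filtered = [d for d in RANKED_DOCS if d["category"] == "A"]
after = top_n_facets(filtered, n=2)

print(before["xyz"], after["xyz"])
```

The count for 'xyz' goes up from 1 to 2 although the filter made the result
set smaller. Counted over all hits, it is 2 both before and after the filter,
and a narrowing filter can never make it larger.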





Re: Showing facet of first N docs

Posted by Dmitry Kan <dm...@gmail.com>.
Do you mean you would like to boost the facets that contain most of the
lemmas?
What is the user query in this case and, if possible, what is the use case
(maybe some other solution exists for what you are trying to achieve)?




-- 
Regards,

Dmitry Kan

Re: Showing facet of first N docs

Posted by Tommaso Teofili <to...@gmail.com>.
Thanks Dmitry, but maybe I didn't explain it correctly. I am not sure
facet.offset is the right solution: I'd like to filter facets, not page
through them.
I'll try to explain better with an example.
Imagine I make a query and the first 2 docs in the results both have 'xyz' and
'abc' as values for the field 'lemmas', while other docs in the results also
have 'xyz' or 'abc' as values of 'lemmas'. I would then like to show facets
"coming from" only the first 2 docs in the results, thus having:
<lst name="lemmas">
  <str name="xyz">2</str>
  <str name="abc">2</str>
</lst>
You can imagine this like a 'give me only facets related to the most
relevant docs in the results' functionality.
Any idea on how to do that?
Tommaso
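
For illustration only, the counts in the example above can be reproduced on
the client side, assuming 'lemmas' is a stored, multi-valued field and the
first request was sent with rows=2 and wt=json. The response below is a
made-up, abbreviated one and count_field_values is a hypothetical helper.

```python
import json
from collections import Counter

# Hypothetical, abbreviated Solr JSON response for rows=2, fl=id,lemmas.
RESPONSE = json.loads("""
{"response": {"numFound": 500, "start": 0, "docs": [
  {"id": "1", "lemmas": ["xyz", "abc"]},
  {"id": "2", "lemmas": ["abc", "xyz"]}
]}}
""")


def count_field_values(response, field):
    """Count in how many returned docs each value of `field` occurs."""
    counts = Counter()
    for doc in response["response"]["docs"]:
        values = doc.get(field, [])
        if isinstance(values, str):
            values = [values]  # single-valued fields come back as a string
        counts.update(set(values))  # count each value once per document
    return counts


lemma_counts = count_field_values(RESPONSE, "lemmas")
print(dict(lemma_counts))
```

Note that this counts stored values, while Solr facets are computed on
indexed terms, so the two can differ when the field is analyzed.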



Re: Showing facet of first N docs

Posted by Dmitry Kan <dm...@gmail.com>.
http://wiki.apache.org/solr/SimpleFacetParameters
facet.offset

This param indicates an offset into the list of constraints to allow paging.

The default value is 0.

This parameter can be specified on a per field basis.
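
As a small illustration of what paging through constraints means, the sketch
below builds the parameters for one "page" of facet values. The query, the
field name 'lemmas' and the helper name facet_page_params are assumptions;
facet.limit and facet.offset are the documented parameters.

```python
from urllib.parse import urlencode


def facet_page_params(query, facet_field, page, page_size):
    """Parameters for one page of facet constraints (hypothetical helper)."""
    return [
        ("q", query),
        ("facet", "true"),
        ("facet.field", facet_field),
        ("facet.limit", str(page_size)),          # constraints per page
        ("facet.offset", str(page * page_size)),  # constraints to skip
    ]


# Second page (page index 1) of ten facet values each.
second_page = facet_page_params("solr", "lemmas", page=1, page_size=10)
print(urlencode(second_page))
```

Both parameters act on the list of facet values; they do not change which
documents are counted.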


Dmitry





-- 
Regards,

Dmitry Kan