Posted to solr-user@lucene.apache.org by Varun Gupta <va...@gmail.com> on 2009/12/10 13:06:26 UTC

Results after using Field Collapsing are not matching the results without using Field Collapsing

Hi,

I have documents under 6 different categories. While searching, I want to
show 3 documents from each category along with a link to see all the
documents under a single category. I decided to use field collapsing so that
I don't have to make 6 queries (one for each category). Currently I am using
the field collapsing patch uploaded on 29th Nov.

Now, the results I get with field collapsing do not match the results for a
single category. For example, for category C1, field collapsing returns
results R1, R2 and R3, but when I view results from category C1 only
(without field collapsing), these results are nowhere in the first 10
results.

Am I doing something wrong, or am I using field collapsing for the wrong
purpose?

I am using the following field collapsing parameters while querying:
   collapse.field=category
   collapse.facet=before
   collapse.threshold=3

--
Thanks
Varun Gupta
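
The expectation behind this question can be made concrete with a small model. The following is a minimal sketch, not the field collapsing patch itself: Hit, collapse, filtered and sampleHits are hypothetical names, and the scores are made up. Assuming both requests sort by score, keeping the top `threshold` hits per category should return the same documents as filtering on one category.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class CollapseSketch {
    private static final Comparator<Hit> BY_SCORE_DESC =
            Comparator.comparingDouble(Hit::score).reversed();

    // Keeps at most `threshold` highest-scoring hits per category.
    static Map<String, List<Hit>> collapse(List<Hit> hits, int threshold) {
        List<Hit> sorted = new ArrayList<>(hits);
        sorted.sort(BY_SCORE_DESC);
        Map<String, List<Hit>> groups = new LinkedHashMap<>();
        for (Hit hit : sorted) {
            List<Hit> group = groups.computeIfAbsent(hit.category, k -> new ArrayList<>());
            if (group.size() < threshold) {
                group.add(hit);
            }
        }
        return groups;
    }

    // Models a filter on one category, sorted by score and cut to `rows`.
    static List<Hit> filtered(List<Hit> hits, String category, int rows) {
        return hits.stream()
                .filter(hit -> hit.category.equals(category))
                .sorted(BY_SCORE_DESC)
                .limit(rows)
                .collect(Collectors.toList());
    }

    // Made-up sample data: four hits in C1, two in C2.
    static List<Hit> sampleHits() {
        return Arrays.asList(
                new Hit("R1", "C1", 0.9), new Hit("R5", "C2", 0.8),
                new Hit("R2", "C1", 0.7), new Hit("R3", "C1", 0.5),
                new Hit("R4", "C1", 0.3), new Hit("R6", "C2", 0.2));
    }

    public static void main(String[] args) {
        List<Hit> hits = sampleHits();
        List<Hit> collapsedC1 = collapse(hits, 3).get("C1");
        List<Hit> filteredC1 = filtered(hits, "C1", 3);
        // Both lists should hold the same three hits in the same order.
        System.out.println(collapsedC1.equals(filteredC1));
    }
}

// Hypothetical model of a search hit; not a Solr class.
class Hit {
    final String id;
    final String category;
    final double score;

    Hit(String id, String category, double score) {
        this.id = id;
        this.category = category;
        this.score = score;
    }

    double score() {
        return score;
    }
}
```

If the two lists differ in a real index, as reported here, the two requests are probably not ranking documents the same way.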

Re: Results after using Field Collapsing are not matching the results without using Field Collapsing

Posted by Varun Gupta <va...@gmail.com>.
Hi Martijn,

Yes, it is working after making these changes.

--
Thanks
Varun Gupta

On Sun, Dec 20, 2009 at 5:54 PM, Martijn v Groningen <
martijn.is.hier@gmail.com> wrote:

> Hi Varun,
>
> Yes, after going over the code I think you are right. If you change
> the following if block in SolrIndexSearcher.getDocSet(Query query,
> DocSet filter, DocSetAwareCollector collector):
> if (first==null) {
>        first = getDocSetNC(absQ, null);
>        filterCache.put(absQ,first);
> }
> with:
> if (first==null) {
>        first = getDocSetNC(absQ, null, collector);
>        filterCache.put(absQ,first);
> }
> It should work then. Let me know if this solves your problem.
>
> Martijn

Re: Results after using Field Collapsing are not matching the results without using Field Collapsing

Posted by Martijn v Groningen <ma...@gmail.com>.
Hi Varun,

Yes, after going over the code I think you are right. If you replace
the following if block in SolrIndexSearcher.getDocSet(Query query,
DocSet filter, DocSetAwareCollector collector):
if (first==null) {
        first = getDocSetNC(absQ, null);
        filterCache.put(absQ,first);
}
with:
if (first==null) {
        first = getDocSetNC(absQ, null, collector);
        filterCache.put(absQ,first);
}
It should work then. Let me know if this solves your problem.

Martijn


2009/12/18 Varun Gupta <va...@gmail.com>:
> After a lot of debugging, I finally found why the order of the collapsed
> results does not match the uncollapsed results. I can't say whether it is
> a bug in the implementation of field collapsing or not.
>
> *Explanation:*
> I am querying with field collapsing plus a filter that restricts the
> collapsing to some particular categories, by appending the parameter:
> fq=ctype:(1+2+8+6+3).
>
> In: NonAdjacentDocumentCollapser.doQuery()
> Line: DocSet filter = searcher.getDocSet(filterQueries);
>
> Here, the filter DocSet is obtained without any scores (since I have a
> filter in my query, this line actually gets executed) and is also stored
> in the filter cache. On the next line of the code, the actual uncollapsed
> DocSet is obtained by passing the DocSetScoreCollector.
>
> Now, in: SolrIndexSearcher.getDocSet(Query query, DocSet filter,
> DocSetAwareCollector collector)
> Line: if (filterCache != null)
> Because the filter cache is not null and there is no result for the query
> in the cache, the line: first = getDocSetNC(absQ,null); gets executed.
> Notice that the DocSetScoreCollector is not passed here. Hence, results
> are collected without any scores.
>
> This leaves the uncollapsedDocSet without any scores, and hence the
> sorting is not done based on score.
>
> @Martijn: Is what I am saying right, or should I use field collapsing in
> some other way? Otherwise, what is the ideal fix for this problem? (I am
> not an active developer, so I can't say that the fix I make will not
> break anything.)
>
> --
> Thanks,
> Varun Gupta



-- 
Met vriendelijke groet,

Martijn van Groningen
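
The cache-miss path discussed above can be modeled in a few lines. This is a simplified sketch under stated assumptions, not the actual SolrIndexSearcher code: CachedSearcherSketch, ScoreCollector, the passCollector flag and the sample scores are all hypothetical. It only illustrates why calling the uncached search with a null collector leaves the collector without scores, and why passing the collector through changes that.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class CachedSearcherSketch {
    private final Map<String, List<Integer>> filterCache = new HashMap<>();
    // Doc id -> score for the single query modeled here (made-up data).
    private final Map<Integer, Float> matches;

    CachedSearcherSketch(Map<Integer, Float> matches) {
        this.matches = matches;
    }

    // Uncached search: reports each match to the collector, if one is given.
    private List<Integer> getDocSetNC(String query, ScoreCollector collector) {
        List<Integer> docs = new ArrayList<>();
        for (Map.Entry<Integer, Float> match : matches.entrySet()) {
            docs.add(match.getKey());
            if (collector != null) {
                collector.collect(match.getKey(), match.getValue());
            }
        }
        return docs;
    }

    // passCollector=false models the original block, true models the change.
    List<Integer> getDocSet(String query, ScoreCollector collector, boolean passCollector) {
        List<Integer> first = filterCache.get(query);
        if (first == null) {
            first = getDocSetNC(query, passCollector ? collector : null);
            filterCache.put(query, first);
        }
        return first;
    }

    // Runs one cache-miss lookup and returns how many scores were collected.
    static int collectedScores(boolean passCollector) {
        Map<Integer, Float> matches = new LinkedHashMap<>();
        matches.put(1, 0.9f);
        matches.put(2, 0.4f);
        matches.put(3, 0.7f);
        ScoreCollector collector = new ScoreCollector();
        new CachedSearcherSketch(matches).getDocSet("weight loss", collector, passCollector);
        return collector.scores.size();
    }

    public static void main(String[] args) {
        System.out.println("collector dropped: " + collectedScores(false));
        System.out.println("collector passed:  " + collectedScores(true));
    }
}

// Hypothetical stand-in for a collector that records one score per document.
class ScoreCollector {
    final Map<Integer, Float> scores = new HashMap<>();

    void collect(int doc, float score) {
        scores.put(doc, score);
    }
}
```

The sketch covers only the cache-miss path described in the thread; it makes no claim about how the patch behaves on a cache hit.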

Re: Results after using Field Collapsing are not matching the results without using Field Collapsing

Posted by Varun Gupta <va...@gmail.com>.
After a lot of debugging, I finally found why the order of the collapsed
results does not match the uncollapsed results. I can't say whether it is
a bug in the implementation of field collapsing or not.

*Explanation:*
I am querying with field collapsing plus a filter that restricts the
collapsing to some particular categories, by appending the parameter:
fq=ctype:(1+2+8+6+3).

In: NonAdjacentDocumentCollapser.doQuery()
Line: DocSet filter = searcher.getDocSet(filterQueries);

Here, the filter DocSet is obtained without any scores (since I have a
filter in my query, this line actually gets executed) and is also stored
in the filter cache. On the next line of the code, the actual uncollapsed
DocSet is obtained by passing the DocSetScoreCollector.

Now, in: SolrIndexSearcher.getDocSet(Query query, DocSet filter,
DocSetAwareCollector collector)
Line: if (filterCache != null)
Because the filter cache is not null and there is no result for the query
in the cache, the line: first = getDocSetNC(absQ,null); gets executed.
Notice that the DocSetScoreCollector is not passed here. Hence, results
are collected without any scores.

This leaves the uncollapsedDocSet without any scores, and hence the
sorting is not done based on score.

@Martijn: Is what I am saying right, or should I use field collapsing in
some other way? Otherwise, what is the ideal fix for this problem? (I am
not an active developer, so I can't say that the fix I make will not
break anything.)

--
Thanks,
Varun Gupta
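
The last step of this explanation, that missing scores lead to a wrong order, can be illustrated with a small sketch. It rests on an assumption the thread does not state: that a document with no collected score is treated as having score 0. ScoreSortSketch, rank and sampleScores are hypothetical names, and the scores are made up.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class ScoreSortSketch {
    // Orders doc ids by descending score; a doc with no recorded score counts as 0.
    static List<Integer> rank(List<Integer> docs, Map<Integer, Float> scores) {
        List<Integer> ranked = new ArrayList<>(docs);
        ranked.sort((a, b) -> Float.compare(
                scores.getOrDefault(b, 0f), scores.getOrDefault(a, 0f)));
        return ranked;
    }

    // Made-up scores for three documents.
    static Map<Integer, Float> sampleScores() {
        Map<Integer, Float> scores = new HashMap<>();
        scores.put(1, 0.2f);
        scores.put(2, 0.9f);
        scores.put(3, 0.5f);
        return scores;
    }

    public static void main(String[] args) {
        List<Integer> docs = Arrays.asList(1, 2, 3);
        // With scores collected, the best-scoring document comes first.
        System.out.println(rank(docs, sampleScores()));
        // With no scores collected, every comparison ties and the stable
        // sort keeps the original document order.
        System.out.println(rank(docs, Collections.<Integer, Float>emptyMap()));
    }
}
```

With the sample data the first call prints [2, 3, 1] and the second prints [1, 2, 3], which is index order rather than relevance order.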


On Mon, Dec 14, 2009 at 10:35 AM, Varun Gupta <va...@gmail.com>wrote:

> When I used collapse.threshold=1, 4 out of the 5 categories had the same
> top result, but 1 category had a different result (it was the 3rd result
> for that category when I used a threshold of 3).
>
> --
> Thanks,
> Varun Gupta

Re: Results after using Field Collapsing are not matching the results without using Field Collapsing

Posted by Varun Gupta <va...@gmail.com>.
When I used collapse.threshold=1, 4 out of the 5 categories had the same top
result, but 1 category had a different result (it was the 3rd result for
that category when I used a threshold of 3).

--
Thanks,
Varun Gupta


On Mon, Dec 14, 2009 at 2:56 AM, Martijn v Groningen <
martijn.is.hier@gmail.com> wrote:

> I would not expect the Solr 1.4 build to be the cause of the problem.
> Just out of curiosity, does the same happen when collapse.threshold=1?

Re: Results after using Field Collapsing are not matching the results without using Field Collapsing

Posted by Martijn v Groningen <ma...@gmail.com>.
I would not expect the Solr 1.4 build to be the cause of the problem.
Just out of curiosity, does the same happen when collapse.threshold=1?

2009/12/11 Varun Gupta <va...@gmail.com>:
> Here is the field type configuration of ctype:
>    <field name="ctype" type="integer" indexed="true" stored="true"
> omitNorms="true" />
>
> In solrconfig.xml, this is how I am enabling field collapsing:
>    <searchComponent name="query"
> class="org.apache.solr.handler.component.CollapseComponent"/>
>
> Apart from this, I made no changes in solrconfig.xml for field collapse. I
> am currently not using the field collapse cache.
>
> I have applied the patch on the Solr 1.4 build. I am not using the latest
> solr nightly build. Can that cause any problem?
>
> --
> Thanks
> Varun Gupta
>
>
> On Fri, Dec 11, 2009 at 3:44 AM, Martijn v Groningen <
> martijn.is.hier@gmail.com> wrote:
>
>> I tried to reproduce a similar situation here, but I got the expected
>> and correct results. The three documents that you saw in your first
>> search result should be the first in your second search result (unless
>> the index or the sort changes) when you apply an fq on that specific
>> category. I'm not sure what is causing this problem. Can you give me
>> some more information, like the field type configuration for the ctype
>> field and how you have configured field collapsing?
>>
>> I did find another problem to do with field collapse caching. The
>> collapse.threshold or collapse.maxdocs parameters are not taken into
>> account when caching, which is off course wrong because they do matter
>> when collapsing. Based on the information you have given me this
>> caching problem is not the cause of the situation you have. I will
>> update the patch that fixes this problem shortly.
>>
>> Martijn
>>
>> 2009/12/10 Varun Gupta <va...@gmail.com>:
>> > Hi Martijn,
>> >
>> > I am not sending the collapse parameters for the second query. Here are
>> the
>> > queries I am using:
>> >
>> > *When using field collapsing (searching over all categories):*
>> >
>> spellcheck=true&collapse.info.doc=true&facet=true&collapse.threshold=3&facet.mincount=1&spellcheck.q=weight+loss&collapse.facet=before&wt=xml&f.content.hl.snippets=2&hl=true&version=2.2&rows=20&collapse.field=ctype&fl=id,sid,title,image,ctype,score&start=0&q=weight+loss&collapse.info.count=false&facet.field=ctype&qt=contentsearch
>> >
>> > categories is represented as the field "ctype" above.
>> >
>> > *Without using field collapsing:*
>> >
>> spellcheck=true&facet=true&facet.mincount=1&spellcheck.q=weight+loss&wt=xml&hl=true&rows=10&version=2.2&fl=id,sid,title,image,ctype,score&start=0&q=weight+loss&facet.field=ctype&qt=contentsearch
>> >
>> > I append "&fq=ctype:1" to the above queries when trying to get results
>> for a
>> > particular category.
>> >
>> > --
>> > Thanks
>> > Varun Gupta
>> >
>> >
>> > On Thu, Dec 10, 2009 at 5:58 PM, Martijn v Groningen <
>> > martijn.is.hier@gmail.com> wrote:
>> >
>> >> Hi Varun,
>> >>
>> >> Can you send the whole requests (with params), that you send to Solr
>> >> for both queries?
>> >> In your situation the collapse parameters only have to be used for the
>> >> first query and not the second query.
>> >>
>> >> Martijn
>> >>
>> >> 2009/12/10 Varun Gupta <va...@gmail.com>:
>> >> > Hi,
>> >> >
>> >> > I have documents under 6 different categories. While searching, I want
>> to
>> >> > show 3 documents from each category along with a link to see all the
>> >> > documents under a single category. I decided to use field collapsing
>> so
>> >> that
>> >> > I don't have to make 6 queries (one for each category). Currently I am
>> >> using
>> >> > the field collapsing patch uploaded on 29th Nov.
>> >> >
>> >> > Now, the results that are coming after using field collapsing are not
>> >> > matching the results for a single category. For example, for category
>> C1,
>> >> I
>> >> > am getting results R1, R2 and R3 using field collapsing, but after I
>> see
>> >> > results only from the category C1 (without using field collapsing)
>> these
>> >> > results are nowhere in the first 10 results.
>> >> >
>> >> > Am I doing something wrong or using the field collapsing for the wrong
>> >> > feature?
>> >> >
>> >> > I am using the following field collapsing parameters while querying:
>> >> >   collapse.field=category
>> >> >   collapse.facet=before
>> >> >   collapse.threshold=3
>> >> >
>> >> > --
>> >> > Thanks
>> >> > Varun Gupta
>> >> >
>> >>
>> >>
>> >>
>> >> --
>> >> Met vriendelijke groet,
>> >>
>> >> Martijn van Groningen
>> >>
>> >
>>
>>
>>
>> --
>> Met vriendelijke groet,
>>
>> Martijn van Groningen
>>
>



-- 
Kind regards,

Martijn van Groningen

Re: Results after using Field Collapsing are not matching the results without using Field Collapsing

Posted by Varun Gupta <va...@gmail.com>.
Here is the field type configuration of ctype:
    <field name="ctype" type="integer" indexed="true" stored="true" omitNorms="true" />

In solrconfig.xml, this is how I am enabling field collapsing:
    <searchComponent name="query" class="org.apache.solr.handler.component.CollapseComponent"/>

Apart from this, I made no changes in solrconfig.xml for field collapse. I
am currently not using the field collapse cache.

I have applied the patch to the Solr 1.4 build. I am not using the latest
Solr nightly build. Can that cause any problem?

--
Thanks
Varun Gupta


On Fri, Dec 11, 2009 at 3:44 AM, Martijn v Groningen <
martijn.is.hier@gmail.com> wrote:

> I tried to reproduce a similar situation here, but I got the expected
> and correct results. The three documents that you saw for a category in
> your first search result should be the first three in your second search
> result (unless the index or the sort changes) when you filter with fq on
> that specific category. I'm not sure what is causing this problem. Can you
> give me some more information, like the field type configuration for the
> ctype field and how you have configured field collapsing?
>
> I did find another problem, to do with field collapse caching. The
> collapse.threshold and collapse.maxdocs parameters are not taken into
> account when caching, which is of course wrong because they do matter
> when collapsing. Based on the information you have given me, this
> caching problem is not the cause of the situation you have. I will
> update the patch to fix this problem shortly.
>
> Martijn
>

Re: Results after using Field Collapsing are not matching the results without using Field Collapsing

Posted by Martijn v Groningen <ma...@gmail.com>.
I tried to reproduce a similar situation here, but I got the expected
and correct results. The three documents that you saw for a category in
your first search result should be the first three in your second search
result (unless the index or the sort changes) when you filter with fq on
that specific category. I'm not sure what is causing this problem. Can you
give me some more information, like the field type configuration for the
ctype field and how you have configured field collapsing?
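
The expectation described above can be written down as a small check. This is only a sketch: the class and method names are hypothetical and are not part of Solr or the field collapsing patch, and it assumes the document ids of both responses have already been extracted into lists in result order.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of Solr or the field collapsing patch.
final class CollapseOrderCheck {

    /**
     * Returns true if the documents shown for one category in the collapsed
     * result are exactly the leading documents of the fq-filtered result,
     * in the same order.
     */
    static boolean isLeadingPrefix(List<String> collapsedIds, List<String> filteredIds) {
        if (collapsedIds.size() > filteredIds.size()) {
            return false;
        }
        return filteredIds.subList(0, collapsedIds.size()).equals(collapsedIds);
    }

    public static void main(String[] args) {
        // Ids are placeholders for the R1, R2, R3 mentioned in the thread.
        List<String> collapsed = Arrays.asList("R1", "R2", "R3");
        List<String> filtered = Arrays.asList("R1", "R2", "R3", "R4", "R5");
        System.out.println(isLeadingPrefix(collapsed, filtered));
    }
}
```

If this check fails for a category while the index and the sort are unchanged, the two requests are not ranking the same document set in the same way.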

I did find another problem, to do with field collapse caching. The
collapse.threshold and collapse.maxdocs parameters are not taken into
account when caching, which is of course wrong because they do matter
when collapsing. Based on the information you have given me, this
caching problem is not the cause of the situation you have. I will
update the patch to fix this problem shortly.
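
To illustrate the kind of caching problem described above, here is a minimal sketch of a cache key that does include the collapse parameters. The class and its fields are hypothetical and are not taken from the patch; it also assumes both parameters can be treated as plain integers.

```java
import java.util.Objects;

// Hypothetical cache key; names are illustrative and not taken from the patch.
final class CollapseCacheKey {
    private final String query;
    private final String collapseField;
    private final int threshold; // assumed to mirror collapse.threshold
    private final int maxDocs;   // assumed to mirror collapse.maxdocs

    CollapseCacheKey(String query, String collapseField, int threshold, int maxDocs) {
        this.query = query;
        this.collapseField = collapseField;
        this.threshold = threshold;
        this.maxDocs = maxDocs;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) {
            return true;
        }
        if (!(o instanceof CollapseCacheKey)) {
            return false;
        }
        CollapseCacheKey other = (CollapseCacheKey) o;
        // threshold and maxDocs take part in equality, so requests that
        // differ only in these parameters never share a cache entry.
        return threshold == other.threshold
                && maxDocs == other.maxDocs
                && query.equals(other.query)
                && collapseField.equals(other.collapseField);
    }

    @Override
    public int hashCode() {
        return Objects.hash(query, collapseField, threshold, maxDocs);
    }
}
```

With a key like this, a request with collapse.threshold=3 cannot be answered from an entry that was cached for collapse.threshold=1.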

Martijn

2009/12/10 Varun Gupta <va...@gmail.com>:
> Hi Martijn,
>
> I am not sending the collapse parameters for the second query. Here are the
> queries I am using:
>
> *When using field collapsing (searching over all categories):*
> spellcheck=true&collapse.info.doc=true&facet=true&collapse.threshold=3&facet.mincount=1&spellcheck.q=weight+loss&collapse.facet=before&wt=xml&f.content.hl.snippets=2&hl=true&version=2.2&rows=20&collapse.field=ctype&fl=id,sid,title,image,ctype,score&start=0&q=weight+loss&collapse.info.count=false&facet.field=ctype&qt=contentsearch
>
> The category is represented by the field "ctype" above.
>
> *Without using field collapsing:*
> spellcheck=true&facet=true&facet.mincount=1&spellcheck.q=weight+loss&wt=xml&hl=true&rows=10&version=2.2&fl=id,sid,title,image,ctype,score&start=0&q=weight+loss&facet.field=ctype&qt=contentsearch
>
> I append "&fq=ctype:1" to the above queries when trying to get results for a
> particular category.
>
> --
> Thanks
> Varun Gupta
>
>



-- 
Kind regards,

Martijn van Groningen

Re: Results after using Field Collapsing are not matching the results without using Field Collapsing

Posted by Varun Gupta <va...@gmail.com>.
Hi Martijn,

I am not sending the collapse parameters for the second query. Here are the
queries I am using:

*When using field collapsing (searching over all categories):*
spellcheck=true&collapse.info.doc=true&facet=true&collapse.threshold=3&facet.mincount=1&spellcheck.q=weight+loss&collapse.facet=before&wt=xml&f.content.hl.snippets=2&hl=true&version=2.2&rows=20&collapse.field=ctype&fl=id,sid,title,image,ctype,score&start=0&q=weight+loss&collapse.info.count=false&facet.field=ctype&qt=contentsearch

The category is represented by the field "ctype" above.

*Without using field collapsing:*
spellcheck=true&facet=true&facet.mincount=1&spellcheck.q=weight+loss&wt=xml&hl=true&rows=10&version=2.2&fl=id,sid,title,image,ctype,score&start=0&q=weight+loss&facet.field=ctype&qt=contentsearch

I append "&fq=ctype:1" to the above queries when trying to get results for a
particular category.
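
For reference, the difference between the two kinds of request can be sketched as follows. This is only an illustration: the class and method names are made up, only a subset of the parameters above is included, and the values are assumed to be URL-encoded already (as in "weight+loss").

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper that assembles the two query strings; not part of Solr.
final class CollapseQueries {

    // Parameters shared by both requests (a subset of the ones listed above).
    static Map<String, String> baseParams() {
        Map<String, String> p = new LinkedHashMap<>();
        p.put("q", "weight+loss");
        p.put("fl", "id,sid,title,image,ctype,score");
        p.put("start", "0");
        p.put("qt", "contentsearch");
        return p;
    }

    // Search over all categories, collapsed on ctype with 3 documents each.
    static Map<String, String> collapsedOverAllCategories() {
        Map<String, String> p = baseParams();
        p.put("rows", "20");
        p.put("collapse.field", "ctype");
        p.put("collapse.threshold", "3");
        p.put("collapse.facet", "before");
        return p;
    }

    // Search within one category: no collapse parameters, only a filter query.
    static Map<String, String> singleCategory(String ctype) {
        Map<String, String> p = baseParams();
        p.put("rows", "10");
        p.put("fq", "ctype:" + ctype);
        return p;
    }

    // Joins the parameters with '&'; no encoding is applied here.
    static String toQueryString(Map<String, String> params) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, String> e : params.entrySet()) {
            if (sb.length() > 0) {
                sb.append('&');
            }
            sb.append(e.getKey()).append('=').append(e.getValue());
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(toQueryString(collapsedOverAllCategories()));
        System.out.println(toQueryString(singleCategory("1")));
    }
}
```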

--
Thanks
Varun Gupta


On Thu, Dec 10, 2009 at 5:58 PM, Martijn v Groningen <
martijn.is.hier@gmail.com> wrote:

> Hi Varun,
>
> Can you send the complete requests (with params) that you send to Solr
> for both queries?
> In your situation the collapse parameters only have to be used for the
> first query and not the second query.
>
> Martijn
>

Re: Results after using Field Collapsing are not matching the results without using Field Collapsing

Posted by Martijn v Groningen <ma...@gmail.com>.
Hi Varun,

Can you send the complete requests (with params) that you send to Solr
for both queries?
In your situation the collapse parameters only have to be used for the
first query and not the second query.

Martijn

2009/12/10 Varun Gupta <va...@gmail.com>:
> Hi,
>
> I have documents under 6 different categories. While searching, I want to
> show 3 documents from each category along with a link to see all the
> documents under a single category. I decided to use field collapsing so that
> I don't have to make 6 queries (one for each category). Currently I am using
> the field collapsing patch uploaded on 29th Nov.
>
> Now, the results that are coming after using field collapsing are not
> matching the results for a single category. For example, for category C1, I
> am getting results R1, R2 and R3 using field collapsing, but after I see
> results only from the category C1 (without using field collapsing) these
> results are nowhere in the first 10 results.
>
> Am I doing something wrong or using the field collapsing for the wrong
> feature?
>
> I am using the following field collapsing parameters while querying:
>   collapse.field=category
>   collapse.facet=before
>   collapse.threshold=3
>
> --
> Thanks
> Varun Gupta
>



-- 
Kind regards,

Martijn van Groningen