Posted to dev@lucene.apache.org by "Bill Bell (JIRA)" <ji...@apache.org> on 2010/11/18 01:07:14 UTC

[jira] Created: (SOLR-2242) Get distinct count of names for a facet field

Get distinct count of names for a facet field
---------------------------------------------

                 Key: SOLR-2242
                 URL: https://issues.apache.org/jira/browse/SOLR-2242
             Project: Solr
          Issue Type: New Feature
          Components: Response Writers
    Affects Versions: 4.0
            Reporter: Bill Bell
            Priority: Minor
             Fix For: 4.0


See SOLR-236.

Need the ability to get back a count of the unique facet values for grouping (field collapsing), instead of returning the facets themselves.



-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


Re: [jira] [Commented] (SOLR-2242) Get distinct count of names for a facet field

Posted by Jason Rutherglen <ja...@gmail.com>.
Bill, which patch is working for you?  It is difficult to follow! :)

On Sat, Jun 9, 2012 at 1:02 AM, William Bell <bi...@gmail.com> wrote:

> I am not sure what the issue is.
>
> This is working for me...

Re: [jira] [Commented] (SOLR-2242) Get distinct count of names for a facet field

Posted by William Bell <bi...@gmail.com>.
I am not sure what the issue is.

This is working for me...

On Fri, Jun 8, 2012 at 8:35 AM, Jason Rutherglen (JIRA) <ji...@apache.org> wrote:
>
>    [ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13291803#comment-13291803 ]
>
> Jason Rutherglen commented on SOLR-2242:
> ----------------------------------------
>
> Terrance, can you post a patch to the Jira?  It makes sense to start this Jira off non-distributed, and add a distributed version in another Jira issue...
>
>> Get distinct count of names for a facet field
>> ---------------------------------------------
>>
>>                 Key: SOLR-2242
>>                 URL: https://issues.apache.org/jira/browse/SOLR-2242
>>             Project: Solr
>>          Issue Type: New Feature
>>          Components: Response Writers
>>    Affects Versions: 4.0
>>            Reporter: Bill Bell
>>            Priority: Minor
>>             Fix For: 4.0
>>
>>         Attachments: SOLR-2242-3x.patch, SOLR-2242-3x_5_tests.patch, SOLR-2242-solr40-3.patch, SOLR-2242.patch, SOLR-2242.patch, SOLR-2242.patch, SOLR-2242.shard.withtests.patch, SOLR-2242.solr3.1-fix.patch, SOLR-2242.solr3.1.patch, SOLR.2242.solr3.1.patch
>>
>>
>> When returning facet.field=<name of field> you will get a list of matches for distinct values. This is normal behavior. This patch tells you how many distinct values you have (# of rows). Use with limit=-1 and mincount=1.
>> The feature is called "namedistinct". Here is an example:
>> http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=2&facet.limit=-1&facet.field=price
>> http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=0&facet.limit=-1&facet.field=price
>> http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=1&facet.limit=-1&facet.field=price
>> This currently only works on facet.field.
>> {code}
>> <lst name="facet_fields">
>>   <lst name="price">
>>     <int name="numFacetTerms">14</int>
>>     <int name="0.0">3</int><int name="11.5">1</int><int name="19.95">1</int><int name="74.99">1</int><int name="92.0">1</int><int name="179.99">1</int><int name="185.0">1</int><int name="279.95">1</int><int name="329.95">1</int><int name="350.0">1</int><int name="399.0">1</int><int name="479.95">1</int><int name="649.99">1</int><int name="2199.0">1</int>
>>   </lst>
>> </lst>
>> {code}
>> Several people use this to get the group.field count (the # of groups).
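
For anyone scripting against the example requests quoted above, here is a minimal sketch that rebuilds those URLs. The host names, shard list and parameter names are copied from the issue description; the helper name build_facet_url is hypothetical, and facet.numFacetTerms is assumed to be available only on a build with one of the SOLR-2242 patches applied.

```python
from urllib.parse import urlencode

# Base URL and shard list are copied from the example requests in the issue.
BASE_URL = "http://localhost:8983/solr/select"
SHARDS = "localhost:8983/solr,localhost:7574/solr"


def build_facet_url(field, num_facet_terms):
    """Return a select URL that facets on `field` with facet.numFacetTerms set.

    Assumption: facet.numFacetTerms is the parameter added by the patch
    discussed in this thread; its accepted values (0, 1, 2) are taken from
    the example URLs, not from any released documentation.
    """
    params = [
        ("shards", SHARDS),
        ("indent", "true"),
        ("q", "*:*"),
        ("facet", "true"),
        ("facet.mincount", 1),
        ("facet.numFacetTerms", num_facet_terms),
        ("facet.limit", -1),
        ("facet.field", field),
    ]
    # Leave ':', '/', ',' and '*' unescaped so the output matches the quoted URLs.
    return BASE_URL + "?" + urlencode(params, safe=":/,*")


if __name__ == "__main__":
    for mode in (2, 0, 1):
        print(build_facet_url("price", mode))
```

The parameters are kept as an ordered list of pairs so the generated string can be compared character for character with the URLs in the description.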



-- 
Bill Bell
billnbell@gmail.com
cell 720-256-8076



[jira] Updated: (SOLR-2242) Get distinct count of names for a facet field

Posted by "Bill Bell (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bill Bell updated SOLR-2242:
----------------------------

    Comment: was deleted

(was: New ver)

> Get distinct count of names for a facet field
> ---------------------------------------------
>
>                 Key: SOLR-2242
>                 URL: https://issues.apache.org/jira/browse/SOLR-2242
>             Project: Solr
>          Issue Type: New Feature
>          Components: Response Writers
>    Affects Versions: 4.0
>            Reporter: Bill Bell
>            Priority: Minor
>             Fix For: 4.0
>
>         Attachments: SOLR.2242.v2.patch
>
>
> When returning facet.field=<name of field> you will get a list of matches for distinct values. This is normal behavior. This patch tells you how many distinct values you have (# of rows). Use with limit=-1 and mincount=1.
> The feature is called "namedistinct". Here is an example:
> http://localhost:8983/solr/select?q=*:*&facet=true&facet.field=manu&facet.mincount=1&facet.limit=-1&f.manu.facet.namedistinct=0&facet.field=price&f.price.facet.namedistinct=1
> Here is an example on field "hgid" (without namedistinct):
> {code}
> <lst name="facet_fields">
>   <lst name="hgid">
>     <int name="HGPY0000045FD36D4000A">1</int>
>     <int name="HGPY00000FBC6690453A9">1</int>
>     <int name="HGPY00001E44ED6C4FB3B">1</int>
>     <int name="HGPY00001FA631034A1B8">1</int>
>     <int name="HGPY00003317ABAC43B48">1</int>
>     <int name="HGPY00003A17B2294CB5A">5</int>
>     <int name="HGPY00003ADD2B3D48C39">1</int>
>   </lst>
> </lst>
> {code}
> With namedistinct (HGPY0000045FD36D4000A, HGPY00000FBC6690453A9, HGPY00001E44ED6C4FB3B, HGPY00001FA631034A1B8, HGPY00003317ABAC43B48, HGPY00003A17B2294CB5A, HGPY00003ADD2B3D48C39). This returns the number of rows (7), not the sum of the per-value counts (11).
> {code}
> <lst name="facet_fields">
>   <lst name="hgid">
>     <int name="_count_">7</int>
>   </lst>
> </lst>
> {code}
> This actually works really well for getting the total number of groups for group.field=hgid. Enjoy!
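
To make the difference between "rows" and "values" in the hgid example concrete, the following sketch parses the facet response quoted above on the client side and computes both numbers. It uses only the Python standard library; the function name distinct_and_total is hypothetical and this is not how the patch computes the count on the server.

```python
import xml.etree.ElementTree as ET

# Facet response copied from the "without namedistinct" example in the issue.
FACET_XML = """
<lst name="facet_fields">
  <lst name="hgid">
    <int name="HGPY0000045FD36D4000A">1</int>
    <int name="HGPY00000FBC6690453A9">1</int>
    <int name="HGPY00001E44ED6C4FB3B">1</int>
    <int name="HGPY00001FA631034A1B8">1</int>
    <int name="HGPY00003317ABAC43B48">1</int>
    <int name="HGPY00003A17B2294CB5A">5</int>
    <int name="HGPY00003ADD2B3D48C39">1</int>
  </lst>
</lst>
"""


def distinct_and_total(xml_text, field):
    """Return (number of distinct facet values, sum of their counts)."""
    root = ET.fromstring(xml_text.strip())
    # The root element is <lst name="facet_fields">; pick the child for `field`.
    field_node = root.find(f"./lst[@name='{field}']")
    counts = [int(node.text) for node in field_node.findall("int")]
    return len(counts), sum(counts)


if __name__ == "__main__":
    rows, total = distinct_and_total(FACET_XML, "hgid")
    print(rows, total)  # 7 distinct values, 11 matching documents in total
```

Counting on the client this way requires fetching every facet value with facet.limit=-1, which is the overhead the namedistinct count is meant to avoid.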



[jira] [Commented] (SOLR-2242) Get distinct count of names for a facet field

Posted by "Lance Norskog (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13026823#comment-13026823 ] 

Lance Norskog commented on SOLR-2242:
-------------------------------------

I changed it to 'facet.numTerms'.

There is still a big performance problem: numTerms builds the entire list of facets and then reports the length of the list. This could be done more efficiently. 
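
As a toy illustration of the concern (not Solr code, and making no claim about how Solr enumerates terms), compare building the whole facet list just to measure its length with counting qualifying terms as they stream past. The function names and the (term, count) input shape are assumptions made for this sketch.

```python
def count_terms_via_list(term_counts, mincount=1):
    """Build the full facet list first, then report its length."""
    facets = [(term, count) for term, count in term_counts if count >= mincount]
    return len(facets)


def count_terms_streaming(term_counts, mincount=1):
    """Count qualifying terms one at a time without keeping the list."""
    return sum(1 for _term, count in term_counts if count >= mincount)


if __name__ == "__main__":
    sample = [("0.0", 3), ("11.5", 1), ("19.95", 1), ("unused", 0)]
    print(count_terms_via_list(sample), count_terms_streaming(sample))
```

Both functions return the same number; the second never holds more than one entry at a time, which matters when a field has a very large number of terms.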

> Get distinct count of names for a facet field
> ---------------------------------------------
>
>                 Key: SOLR-2242
>                 URL: https://issues.apache.org/jira/browse/SOLR-2242
>             Project: Solr
>          Issue Type: New Feature
>          Components: Response Writers
>    Affects Versions: 4.0
>            Reporter: Bill Bell
>            Priority: Minor
>             Fix For: 4.0
>
>         Attachments: SOLR-2242.patch, SOLR-2242.solr3.1.patch, SOLR.2242.solr3.1.patch, SOLR.2242.v2.patch


[jira] [Issue Comment Edited] (SOLR-2242) Get distinct count of names for a facet field

Posted by "Antoine Le Floc'h (Issue Comment Edited) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13174739#comment-13174739 ] 

Antoine Le Floc'h edited comment on SOLR-2242 at 12/22/11 11:02 AM:
--------------------------------------------------------------------

To help with the specification, my use case is this: I am using this patch, I may want to add extra information to the facet results, and I want to use sharding. Basically, this is what I have today with the patch:
{code}
<lst name="shop_id">
  <int name="numTerms">10251</int>
  <lst name="counts">
    <int name="28013756">7032406</int>
    <int name="28009589">3616625</int>
    <int name="976">3497825</int>
    <int name="635">1398780</int>
    <int name="28021713">440118</int>
    <int name="29047336">368921</int>
    <int name="411">244689</int>
  </lst>
</lst>
{code}
and I want to subclass/modify SimpleFacets to add more data for each item (since I don't see another way to do it).
                
                  
> Get distinct count of names for a facet field
> ---------------------------------------------
>
>                 Key: SOLR-2242
>                 URL: https://issues.apache.org/jira/browse/SOLR-2242
>             Project: Solr
>          Issue Type: New Feature
>          Components: Response Writers
>    Affects Versions: 4.0
>            Reporter: Bill Bell
>            Assignee: Erick Erickson
>            Priority: Minor
>             Fix For: 4.0
>
>         Attachments: NumFacetTermsFacetsTest.java, SOLR-2242-notworkingtest.patch, SOLR-2242.patch, SOLR-2242.patch, SOLR-2242.patch, SOLR-2242.shard.patch, SOLR-2242.shard.patch, SOLR-2242.shard.withtests.patch, SOLR-2242.solr3.1-fix.patch, SOLR-2242.solr3.1.patch, SOLR.2242.solr3.1.patch, SOLR.2242.v2.patch


[jira] [Updated] (SOLR-2242) Get distinct count of names for a facet field

Posted by "Bill Bell (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bill Bell updated SOLR-2242:
----------------------------

    Attachment:     (was: SOLR-2242-notworkingtest.patch)
    
> Get distinct count of names for a facet field
> ---------------------------------------------
>
>                 Key: SOLR-2242
>                 URL: https://issues.apache.org/jira/browse/SOLR-2242
>             Project: Solr
>          Issue Type: New Feature
>          Components: Response Writers
>    Affects Versions: 4.0
>            Reporter: Bill Bell
>            Priority: Minor
>             Fix For: 4.0
>
>         Attachments: SOLR-2242-solr40-3.patch, SOLR-2242.patch, SOLR-2242.patch, SOLR-2242.patch, SOLR-2242.shard.patch, SOLR-2242.shard.withtests.patch, SOLR-2242.solr3.1-fix.patch, SOLR-2242.solr3.1.patch, SOLR-2242.solr35.patch, SOLR.2242.solr3.1.patch


[jira] [Issue Comment Edited] (SOLR-2242) Get distinct count of names for a facet field

Posted by "Lance Norskog (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13026124#comment-13026124 ] 

Lance Norskog edited comment on SOLR-2242 at 4/28/11 5:33 AM:
--------------------------------------------------------------

Putting up or shutting up :)

This splits apart whether to count terms vs. whether to count docs per term. They are independent concepts.

Instead of 'numFacetTerms=0/1/2' it is 'numTerms=true/false'.
If you set 'numTerms=true', it counts terms.
If you set facet.limit=0, it does not do the facet search. It does not count docs per term.
If you set 'numTerms=false' and 'facet.limit=0', it does nothing.

And, everything is called 'facet' and 'term' :)
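
The combinations described above can be written out as a small decision table. This is only a sketch of my reading of the comment, not the patch itself: the function name facet_work is hypothetical, and treating any facet.limit other than 0 as "do the facet search" is an assumption.

```python
def facet_work(num_terms, facet_limit):
    """Report which of the two independent computations would run.

    num_terms   -- value of the numTerms flag (True/False)
    facet_limit -- value of facet.limit; 0 is read as "skip the facet search"
    """
    return {
        "count_terms": bool(num_terms),
        "count_docs_per_term": facet_limit != 0,
    }


if __name__ == "__main__":
    for num_terms in (True, False):
        for limit in (-1, 0):
            print(num_terms, limit, facet_work(num_terms, limit))
```

With numTerms=false and facet.limit=0 both entries are False, which matches the "it does nothing" case.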


      was (Author: lancenorskog):
    Putting up or shutting up :)

  
> Get distinct count of names for a facet field
> ---------------------------------------------
>
>                 Key: SOLR-2242
>                 URL: https://issues.apache.org/jira/browse/SOLR-2242
>             Project: Solr
>          Issue Type: New Feature
>          Components: Response Writers
>    Affects Versions: 4.0
>            Reporter: Bill Bell
>            Priority: Minor
>             Fix For: 4.0
>
>         Attachments: SOLR-2242.patch, SOLR-2242.solr3.1.patch, SOLR.2242.solr3.1.patch, SOLR.2242.v2.patch


[jira] [Updated] (SOLR-2242) Get distinct count of names for a facet field

Posted by "Erick Erickson (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Erick Erickson updated SOLR-2242:
---------------------------------

    Attachment: SOLR-2242-3x.patch

This patch applies against the 3.x code line. Bill, you might want to check it; I had to do some merging by hand.
                
> Get distinct count of names for a facet field
> ---------------------------------------------
>
>                 Key: SOLR-2242
>                 URL: https://issues.apache.org/jira/browse/SOLR-2242
>             Project: Solr
>          Issue Type: New Feature
>          Components: Response Writers
>    Affects Versions: 4.0
>            Reporter: Bill Bell
>            Priority: Minor
>             Fix For: 4.0
>
>         Attachments: SOLR-2242-3x.patch, SOLR-2242-solr40-3.patch, SOLR-2242.patch, SOLR-2242.patch, SOLR-2242.patch, SOLR-2242.shard.withtests.patch, SOLR-2242.solr3.1-fix.patch, SOLR-2242.solr3.1.patch, SOLR-2242.solr35.patch, SOLR.2242.solr3.1.patch


[jira] [Updated] (SOLR-2242) Get distinct count of names for a facet field

Posted by "Bill Bell (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bill Bell updated SOLR-2242:
----------------------------

    Attachment:     (was: SOLR-2242.shard.patch)
    
> Get distinct count of names for a facet field
> ---------------------------------------------
>
>                 Key: SOLR-2242
>                 URL: https://issues.apache.org/jira/browse/SOLR-2242
>             Project: Solr
>          Issue Type: New Feature
>          Components: Response Writers
>    Affects Versions: 4.0
>            Reporter: Bill Bell
>            Priority: Minor
>             Fix For: 4.0
>
>         Attachments: SOLR-2242-solr40-3.patch, SOLR-2242.patch, SOLR-2242.patch, SOLR-2242.patch, SOLR-2242.shard.patch, SOLR-2242.shard.withtests.patch, SOLR-2242.solr3.1-fix.patch, SOLR-2242.solr3.1.patch, SOLR-2242.solr35.patch, SOLR.2242.solr3.1.patch


[jira] [Commented] (SOLR-2242) Get distinct count of names for a facet field

Posted by "Antoine Le Floc'h (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13233453#comment-13233453 ] 

Antoine Le Floc'h commented on SOLR-2242:
-----------------------------------------

Bill,

Just a thought: how are you going to plug in [SOLR-3134|https://issues.apache.org/jira/browse/SOLR-3134] then?
Since we are not able to aggregate distinct count over shards, shouldn't you do something like:
{code}
<lst name="facet_numTerms">
  <lst name="localhost:7777/solr">
    <int name="cat">15</int>
    <int name="price">14</int>
  </lst>
  <lst name="localhost:8888/solr">
    <int name="cat">3</int>
    <int name="price">23</int>
  </lst>
</lst>
{code}
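
A short sketch of why per-shard distinct counts cannot simply be added up: the same term can live on more than one shard, so the per-shard numbers only bound the global distinct count. The shard contents below are made up for illustration, and distinct_count_bounds is a hypothetical helper, not part of any patch.

```python
def distinct_count_bounds(per_shard_counts):
    """Return (lower, upper) bounds on the global distinct count.

    The lower bound is the largest single shard (all other terms overlap it);
    the upper bound is the sum (no term appears on two shards).
    """
    return max(per_shard_counts), sum(per_shard_counts)


# Hypothetical term sets for one field on two shards.
shard_a = {"electronics", "memory", "camera"}
shard_b = {"memory", "camera", "software"}

if __name__ == "__main__":
    # Per-shard counts are 3 and 3, but only 4 terms are distinct overall.
    print(len(shard_a), len(shard_b), len(shard_a | shard_b))
    # Using the "cat" counts from the response sketched above: 15 and 3.
    print(distinct_count_bounds([15, 3]))
```

For the "cat" numbers in the proposed response, all a client can infer is that the true distinct count lies somewhere between 15 and 18, which is why reporting the counts per shard is more honest than reporting a single merged figure.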

                
> Get distinct count of names for a facet field
> ---------------------------------------------
>
>                 Key: SOLR-2242
>                 URL: https://issues.apache.org/jira/browse/SOLR-2242
>             Project: Solr
>          Issue Type: New Feature
>          Components: Response Writers
>    Affects Versions: 4.0
>            Reporter: Bill Bell
>            Priority: Minor
>             Fix For: 4.0
>
>         Attachments: NumFacetTermsFacetsTest.java, SOLR-2242-notworkingtest.patch, SOLR-2242-solr40.patch, SOLR-2242.patch, SOLR-2242.patch, SOLR-2242.patch, SOLR-2242.shard.patch, SOLR-2242.shard.patch, SOLR-2242.shard.withtests.patch, SOLR-2242.solr3.1-fix.patch, SOLR-2242.solr3.1.patch, SOLR.2242.solr3.1.patch, SOLR.2242.v2.patch
>
>
> When returning facet.field=<name of field> you will get a list of matches for distinct values. This is normal behavior. This patch tells you how many distinct values you have (# of rows). Use with limit=-1 and mincount=1.
> The feature is called "namedistinct". Here is an example:
> http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=2&facet.limit=-1&facet.field=price
> http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=0&facet.limit=-1&facet.field=price
> http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=1&facet.limit=-1&facet.field=price
> This currently only works on facet.field.
> {code}
> <lst name="facet_fields">
>   <lst name="price">
>     <int name="numFacetTerms">14</int>
>     <int name="0.0">3</int><int name="11.5">1</int><int name="19.95">1</int><int name="74.99">1</int><int name="92.0">1</int><int name="179.99">1</int><int name="185.0">1</int><int name="279.95">1</int><int name="329.95">1</int><int name="350.0">1</int><int name="399.0">1</int><int name="479.95">1</int><int name="649.99">1</int><int name="2199.0">1</int>
>   </lst>
> </lst>
> {code} 
> Several people use this to get the group.field count (the # of groups).
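
For anyone scripting against the patch, the request URLs above could be built along these lines. This is only a sketch: facet.numFacetTerms exists only with the patch applied, the helper name is hypothetical, and the meaning of the values (1 = count only, 2 = count plus facet values, 0 = presumably off) is taken from the discussion on this issue rather than from released Solr.

```python
from urllib.parse import urlencode


def num_facet_terms_url(base, field, mode, shards=None):
    """Build a select URL that asks the patched Solr for numFacetTerms.

    mode is passed through as facet.numFacetTerms (assumed: 0, 1 or 2).
    """
    params = [
        ("q", "*:*"),
        ("facet", "true"),
        ("facet.field", field),
        ("facet.mincount", "1"),
        ("facet.limit", "-1"),
        ("facet.numFacetTerms", str(mode)),
    ]
    if shards:
        # Distributed request: comma-separated list of shard addresses.
        params.insert(0, ("shards", ",".join(shards)))
    # Keep '*', ':', '/' and ',' readable, as in the URLs quoted above.
    return base + "?" + urlencode(params, safe="*:/,")


url = num_facet_terms_url(
    "http://localhost:8983/solr/select",
    "price",
    2,
    shards=["localhost:8983/solr", "localhost:7574/solr"],
)
print(url)
```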



[jira] [Updated] (SOLR-2242) Get distinct count of names for a facet field

Posted by "Erick Erickson (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Erick Erickson updated SOLR-2242:
---------------------------------

    Attachment: SOLR-2242.patch

First step in resurrecting this. This patch should apply cleanly to trunk. It incorporates the SOLR-2242.patch from 28-June and the NumFacetTermsFacetsTest from 9-July. It accounts for the fact that things seem to have been moved around a bit.
                
> Get distinct count of names for a facet field
> ---------------------------------------------
>
>                 Key: SOLR-2242
>                 URL: https://issues.apache.org/jira/browse/SOLR-2242
>             Project: Solr
>          Issue Type: New Feature
>          Components: Response Writers
>    Affects Versions: 4.0
>            Reporter: Bill Bell
>            Assignee: Erick Erickson
>            Priority: Minor
>             Fix For: 4.0
>
>         Attachments: NumFacetTermsFacetsTest.java, SOLR-2242-notworkingtest.patch, SOLR-2242.patch, SOLR-2242.patch, SOLR-2242.patch, SOLR-2242.shard.patch, SOLR-2242.shard.patch, SOLR-2242.shard.withtests.patch, SOLR-2242.solr3.1-fix.patch, SOLR-2242.solr3.1.patch, SOLR.2242.solr3.1.patch, SOLR.2242.v2.patch
>
>


[jira] Commented: (SOLR-2242) Get distinct count of names for a facet field

Posted by "Peter Sturge (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13006445#comment-13006445 ] 

Peter Sturge commented on SOLR-2242:
------------------------------------

+1 Yep, me too. Useful feature, this.


> Get distinct count of names for a facet field
> ---------------------------------------------
>
>                 Key: SOLR-2242
>                 URL: https://issues.apache.org/jira/browse/SOLR-2242
>             Project: Solr
>          Issue Type: New Feature
>          Components: Response Writers
>    Affects Versions: 4.0
>            Reporter: Bill Bell
>            Priority: Minor
>             Fix For: 4.0
>
>         Attachments: SOLR-2242-distinctFacet.patch
>
>
> When returning facet.field=<name of field> you will get a list of matches for distinct values. This is normal behavior. This patch tells you how many distinct values you have (# of rows). Use with limit=-1 and mincount=1.
> The feature is called "namedistinct". Here is an example:
> http://localhost:8983/solr/select?q=*:*&facet=true&facet.field=manu&facet.mincount=1&facet.limit=-1&f.manu.facet.namedistinct=0&facet.field=price&f.price.facet.namedistinct=1
> Here is an example on field "hgid" (without namedistinct):
> {code}
> <lst name="facet_fields">
>   <lst name="hgid">
>     <int name="HGPY0000045FD36D4000A">1</int>
>     <int name="HGPY00000FBC6690453A9">1</int>
>     <int name="HGPY00001E44ED6C4FB3B">1</int>
>     <int name="HGPY00001FA631034A1B8">1</int>
>     <int name="HGPY00003317ABAC43B48">1</int>
>     <int name="HGPY00003A17B2294CB5A">5</int>
>     <int name="HGPY00003ADD2B3D48C39">1</int>
>   </lst>
> </lst>
> {code}
> With namedistinct (HGPY0000045FD36D4000A, HGPY00000FBC6690453A9, HGPY00001E44ED6C4FB3B, HGPY00001FA631034A1B8, HGPY00003317ABAC43B48, HGPY00003A17B2294CB5A, HGPY00003ADD2B3D48C39). This returns number of rows (7), not the number of values (11).
> {code}
> <lst name="facet_fields">
>   <lst name="hgid">
>     <int name="_count_">7</int>
>   </lst>
> </lst>
> {code}
> This actually works really well for getting the total number of groups for group.field=hgid. Enjoy!
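
The difference between the number of rows (7) and the number of values (11) in the hgid example can be checked with a few lines. This is just a sketch over the counts quoted above; the function names are hypothetical.

```python
# Facet counts for field "hgid", copied from the example response above.
hgid_counts = {
    "HGPY0000045FD36D4000A": 1,
    "HGPY00000FBC6690453A9": 1,
    "HGPY00001E44ED6C4FB3B": 1,
    "HGPY00001FA631034A1B8": 1,
    "HGPY00003317ABAC43B48": 1,
    "HGPY00003A17B2294CB5A": 5,
    "HGPY00003ADD2B3D48C39": 1,
}


def distinct_terms(counts, mincount=1):
    """Number of facet rows whose count reaches mincount."""
    return sum(1 for c in counts.values() if c >= mincount)


def total_values(counts):
    """Sum of all facet counts, i.e. the number of matching values."""
    return sum(counts.values())


print(distinct_terms(hgid_counts))  # 7 rows
print(total_values(hgid_counts))    # 11 values
```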



[jira] Issue Comment Edited: (SOLR-2242) Get distinct count of names for a facet field

Posted by "Bill Bell (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13006792#comment-13006792 ] 

Bill Bell edited comment on SOLR-2242 at 3/15/11 8:22 AM:
----------------------------------------------------------

I am going to use your suggestion. You will not have to set the limit. Getting the numFacetTerms will be optional, and you also will be able to NOT get the hgids as well. I propose this (please comment):

This will ONLY output the numFacetTerms (no hgid facet counts):
http://localhost:8983/solr/select?q=*:*&facet=true&facet.field=hgid&f.hgid.facet.numFacetTerms=1

This assumes the count will be limit=-1

{code}
<lst name="facet_fields">
  <lst name="hgid">
   <int name="numFacetTerms">7</int>  <!-- this is not 11 -->
  </lst>
</lst>
{code}

This will output the numFacetTerms AND hgid:
http://localhost:8983/solr/select?q=*:*&facet=true&facet.field=hgid&facet.mincount=1&f.hgid.facet.numFacetTerms=2

{code}
<lst name="facet_fields">
  <lst name="hgid">
   <int name="numFacetTerms">7</int>  <!-- this is not 11 -->
   <lst name="counts">
   	<int name="HGPY0000045FD36D4000A">1</int>
   	<int name="HGPY00000FBC6690453A9">1</int>
   	<int name="HGPY00001E44ED6C4FB3B">1</int>
   	<int name="HGPY00001FA631034A1B8">1</int>
   	<int name="HGPY00003317ABAC43B48">1</int>
   	<int name="HGPY00003A17B2294CB5A">5</int>
   	<int name="HGPY00003ADD2B3D48C39">1</int>
   </lst>
  </lst>
</lst>
{code}

      was (Author: billnbell):
    I am going to use your suggestion. You will not have to set the limit. Getting the numFacetTerms will be optional, and you also will be able to NOT get the hgids as well. I propose this (please comment):

This will ONLY output the numFacetTerms (no hgid facet counts):
http://localhost:8983/solr/select?q=*:*&facet=true&facet.field=hgid&f.hgid.facet.numfacetterms=1

This assumes the count will be limit=-1

{code}
<lst name="facet_fields">
  <lst name="hgid">
   <int name="numFacetTerms">7</int>  <!-- this is not 11 -->
  </lst>
</lst>
{code}

This will output the numFacetTerms AND hgid:
http://localhost:8983/solr/select?q=*:*&facet=true&facet.field=hgid&facet.mincount=1&f.hgid.facet.numfacetterms=2

{code}
<lst name="facet_fields">
  <lst name="hgid">
   <int name="numFacetTerms">7</int>  <!-- this is not 11 -->
   <lst name="counts">
   	<int name="HGPY0000045FD36D4000A">1</int>
   	<int name="HGPY00000FBC6690453A9">1</int>
   	<int name="HGPY00001E44ED6C4FB3B">1</int>
   	<int name="HGPY00001FA631034A1B8">1</int>
   	<int name="HGPY00003317ABAC43B48">1</int>
   	<int name="HGPY00003A17B2294CB5A">5</int>
   	<int name="HGPY00003ADD2B3D48C39">1</int>
   </lst>
  </lst>
</lst>
{code}
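
A client could read both proposed response shapes roughly as follows. This is a sketch that assumes the XML layout proposed in this comment (numFacetTerms next to an optional nested "counts" list); the final patch may differ, and the function name is hypothetical.

```python
import xml.etree.ElementTree as ET

# Response shape proposed for f.hgid.facet.numFacetTerms=1 (count only).
COUNT_ONLY = """<lst name="facet_fields">
  <lst name="hgid">
    <int name="numFacetTerms">7</int>
  </lst>
</lst>"""

# Response shape proposed for f.hgid.facet.numFacetTerms=2 (count and values).
COUNT_AND_VALUES = """<lst name="facet_fields">
  <lst name="hgid">
    <int name="numFacetTerms">7</int>
    <lst name="counts">
      <int name="HGPY0000045FD36D4000A">1</int>
      <int name="HGPY00000FBC6690453A9">1</int>
      <int name="HGPY00001E44ED6C4FB3B">1</int>
      <int name="HGPY00001FA631034A1B8">1</int>
      <int name="HGPY00003317ABAC43B48">1</int>
      <int name="HGPY00003A17B2294CB5A">5</int>
      <int name="HGPY00003ADD2B3D48C39">1</int>
    </lst>
  </lst>
</lst>"""


def read_field(xml_text, field):
    """Return (numFacetTerms or None, {term: count}) for one facet field."""
    root = ET.fromstring(xml_text)
    node = root.find(f"lst[@name='{field}']")
    if node is None:
        raise KeyError(field)
    total = node.find("int[@name='numFacetTerms']")
    counts = node.find("lst[@name='counts']")
    values = {}
    if counts is not None:
        values = {i.get("name"): int(i.text) for i in counts.findall("int")}
    return (None if total is None else int(total.text)), values


print(read_field(COUNT_ONLY, "hgid"))
print(read_field(COUNT_AND_VALUES, "hgid"))
```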
  
> Get distinct count of names for a facet field
> ---------------------------------------------
>
>                 Key: SOLR-2242
>                 URL: https://issues.apache.org/jira/browse/SOLR-2242
>             Project: Solr
>          Issue Type: New Feature
>          Components: Response Writers
>    Affects Versions: 4.0
>            Reporter: Bill Bell
>            Priority: Minor
>             Fix For: 4.0
>
>         Attachments: SOLR.2242.v2.patch
>
>


Re: [jira] Commented: (SOLR-2242) Get distinct count of names for a facet field

Posted by Bill Bell <bi...@gmail.com>.
Thanks. 

Not sure how to get the facet distinct count without looping, but I'll
look into that. Not sure what "constraints" means?

I agree that you should not have to specify limit, but mincount should
apply, since many times I want 1 or higher.

Would we always include this, or just add it as an option?

f.hgid.facet.namedistinct=1 ?

Proposed:
{code}
"facet fields" : {"hgid" : {
  "missing" : 25,
  "namedistinct" : 1250,
  "counts" : ["constraint",10,...]
}}
{code}


Then we add others as needed?

Or do you mean?

f.hgid.facet.constraints = namedistinct() with the option to specify more
than one?

f.hgid.facet.constraints = namedistinct(),missing()


Proposed:
{code}
"facet fields" : {"hgid" : {
  "constraints" : ["missing()",25,"namedistinct()",1250],
  "counts" : ["constraint",10,...]
}}
{code}
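
On the client side, the flat "constraints" list in the second proposal could be turned into a lookup like this. It is only a sketch of reading the proposed shape; the key names are the ones from the example above and nothing here exists in Solr yet.

```python
def pairs_to_dict(flat):
    """Convert an alternating [name, value, name, value, ...] list to a dict."""
    if len(flat) % 2 != 0:
        raise ValueError("expected an even number of items (name/value pairs)")
    return dict(zip(flat[0::2], flat[1::2]))


# The "constraints" entry from the proposed response above.
constraints = pairs_to_dict(["missing()", 25, "namedistinct()", 1250])
print(constraints["namedistinct()"])
```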



On 3/14/11 7:05 PM, "Yonik Seeley (JIRA)" <ji...@apache.org> wrote:

>
>    [ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13006750#comment-13006750 ]
>
>Yonik Seeley commented on SOLR-2242:
>------------------------------------
>
>It feels like we should have an option to return the number of
>constraints that match the criteria (mincount, etc) w/o having to specify
>facet.limit=-1, and you should be able to get this info in addition to
>the normal facet counts.  We can also improve the efficiency by not
>building the complete list in memory just to return its count.
>
>We've also talked before about having an extra metadata level for each
>facet.
>
>Current:
>{code}
>"facet fields" : {"hgid" : ["constraint",10,...]}
>{code}
>
>Proposed:
>{code}
>"facet fields" : {"hgid" : {
>  "missing" : 25,
>  "constraints" : 1250,
>  "counts" : ["constraint",10,...]
>}}
>{code}
>
>> Get distinct count of names for a facet field
>> ---------------------------------------------
>>
>>                 Key: SOLR-2242
>>                 URL: https://issues.apache.org/jira/browse/SOLR-2242
>>             Project: Solr
>>          Issue Type: New Feature
>>          Components: Response Writers
>>    Affects Versions: 4.0
>>            Reporter: Bill Bell
>>            Priority: Minor
>>             Fix For: 4.0
>>
>>         Attachments: SOLR-2242-distinctFacet.patch
>>
>>


[jira] Commented: (SOLR-2242) Get distinct count of names for a facet field

Posted by "Yonik Seeley (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13006750#comment-13006750 ] 

Yonik Seeley commented on SOLR-2242:
------------------------------------

It feels like we should have an option to return the number of constraints that match the criteria (mincount, etc) w/o having to specify facet.limit=-1, and you should be able to get this info in addition to the normal facet counts.  We can also improve the efficiency by not building the complete list in memory just to return its count.

We've also talked before about having an extra metadata level for each facet.

Current:
{code}
"facet fields" : {"hgid" : ["constraint",10,...]}
{code}

Proposed:
{code}
"facet fields" : {"hgid" : {
  "missing" : 25,
  "constraints" : 1250,
  "counts" : ["constraint",10,...]
}}
{code}
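
The efficiency point can be sketched as follows: count the constraints that pass mincount while streaming over (term, count) pairs, instead of first materializing the full facet list. This only illustrates the idea; it is not how SimpleFacets is written, and the term stream below is a made-up stand-in for the index.

```python
from typing import Iterable, Iterator, Tuple


def count_constraints(term_counts: Iterable[Tuple[str, int]], mincount: int = 1) -> int:
    """Count constraints with count >= mincount without building a list."""
    matching = 0
    for _term, count in term_counts:
        if count >= mincount:
            matching += 1
    return matching


def fake_term_stream(n: int) -> Iterator[Tuple[str, int]]:
    """Hypothetical lazy source of (term, count) pairs; counts cycle 0, 1, 2."""
    for i in range(n):
        yield f"term{i}", i % 3


print(count_constraints(fake_term_stream(9), mincount=1))
print(count_constraints(fake_term_stream(9), mincount=2))
```

Memory use stays constant regardless of how many terms the field has, since only the running total is kept.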

> Get distinct count of names for a facet field
> ---------------------------------------------
>
>                 Key: SOLR-2242
>                 URL: https://issues.apache.org/jira/browse/SOLR-2242
>             Project: Solr
>          Issue Type: New Feature
>          Components: Response Writers
>    Affects Versions: 4.0
>            Reporter: Bill Bell
>            Priority: Minor
>             Fix For: 4.0
>
>         Attachments: SOLR-2242-distinctFacet.patch
>
>


[jira] Updated: (SOLR-2242) Get distinct count of names for a facet field

Posted by "Bill Bell (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bill Bell updated SOLR-2242:
----------------------------

    Attachment:     (was: SOLR-2242.v2.patch)

> Get distinct count of names for a facet field
> ---------------------------------------------
>
>                 Key: SOLR-2242
>                 URL: https://issues.apache.org/jira/browse/SOLR-2242
>             Project: Solr
>          Issue Type: New Feature
>          Components: Response Writers
>    Affects Versions: 4.0
>            Reporter: Bill Bell
>            Priority: Minor
>             Fix For: 4.0
>
>         Attachments: SOLR-2242-distinctFacet.patch, SOLR.2242.v2.patch
>
>


[jira] [Commented] (SOLR-2242) Get distinct count of names for a facet field

Posted by "Simon Willnauer (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13049077#comment-13049077 ] 

Simon Willnauer commented on SOLR-2242:
---------------------------------------

Hey Bill,
I looked at your patch and I have some comments:

* you should fix the whitespace within the try {} catch block in SimpleFacets
* I think you should also make the constant name consistent with the facet parameter: s/FACET_NAMEDISTINCT/FACET_NUM_FACET_TERMS/
* as Lance noted (in a not necessarily appropriate tone, but that is a different issue), switch to a constant / enum rather than a number, something like [ COUNTS, COUNTS_AND_VALUES ]
* if the termList is not null, the result is implicit: it is always the number of terms you specify in the term list, right? I think we should not support this, i.e. only compute the count if no term list is specified
* if you are asking for COUNTS_AND_VALUES (the 2 case), it seems we should check whether the limit is already -1 so we don't compute that twice?
* I think you should use a switch / case or an if / else construct instead of having 3 plain if statements

I only considered the last patch you uploaded. Let me know if I should look at something else.
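
As a rough illustration of the enum and if / else suggestions, here is a sketch in Python rather than the Java of the patch. COUNTS and COUNTS_AND_VALUES come from the comment above; OFF, the numeric values, and the function name are assumptions made for the example.

```python
from enum import Enum


class NumFacetTermsMode(Enum):
    OFF = 0                # assumption: 0 means the feature is disabled
    COUNTS = 1             # only the number of distinct terms
    COUNTS_AND_VALUES = 2  # the number of terms plus the facet values


def build_field_result(term_counts, mode):
    """Build the response for one facet field using a single if / elif chain."""
    if mode is NumFacetTermsMode.OFF:
        return dict(term_counts)
    elif mode is NumFacetTermsMode.COUNTS:
        return {"numFacetTerms": len(term_counts)}
    elif mode is NumFacetTermsMode.COUNTS_AND_VALUES:
        return {"numFacetTerms": len(term_counts), "counts": dict(term_counts)}
    else:
        raise ValueError(f"unsupported mode: {mode!r}")


# The request parameter arrives as a string; Enum lookup rejects unknown values.
mode = NumFacetTermsMode(int("2"))
print(build_field_result({"a": 3, "b": 1}, mode))
```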

Simon

> Get distinct count of names for a facet field
> ---------------------------------------------
>
>                 Key: SOLR-2242
>                 URL: https://issues.apache.org/jira/browse/SOLR-2242
>             Project: Solr
>          Issue Type: New Feature
>          Components: Response Writers
>    Affects Versions: 4.0
>            Reporter: Bill Bell
>            Assignee: Simon Willnauer
>            Priority: Minor
>             Fix For: 4.0
>
>         Attachments: SOLR-2242.patch, SOLR-2242.shard.patch, SOLR-2242.solr3.1.patch, SOLR.2242.solr3.1.patch, SOLR.2242.v2.patch
>
>


[jira] [Updated] (SOLR-2242) Get distinct count of names for a facet field

Posted by "Dmitry Drozdov (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dmitry Drozdov updated SOLR-2242:
---------------------------------

    Attachment: SOLR.2242.solr3.1.patch

Thanks for the patch!
It also works for version 3.1; only the line numbers differ. Attaching the adapted patch for 3.1 just in case.

> Get distinct count of names for a facet field
> ---------------------------------------------
>
>                 Key: SOLR-2242
>                 URL: https://issues.apache.org/jira/browse/SOLR-2242
>             Project: Solr
>          Issue Type: New Feature
>          Components: Response Writers
>    Affects Versions: 4.0
>            Reporter: Bill Bell
>            Priority: Minor
>             Fix For: 4.0
>
>         Attachments: SOLR.2242.solr3.1.patch, SOLR.2242.v2.patch
>
>


[jira] Updated: (SOLR-2242) Get distinct count of names for a facet field

Posted by "Bill Bell (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bill Bell updated SOLR-2242:
----------------------------

    Attachment: SOLR-2242.v2.patch

v2 of the patch, based on feedback.

> Get distinct count of names for a facet field
> ---------------------------------------------
>
>                 Key: SOLR-2242
>                 URL: https://issues.apache.org/jira/browse/SOLR-2242
>             Project: Solr
>          Issue Type: New Feature
>          Components: Response Writers
>    Affects Versions: 4.0
>            Reporter: Bill Bell
>            Priority: Minor
>             Fix For: 4.0
>
>         Attachments: SOLR-2242-distinctFacet.patch, SOLR-2242.v2.patch
>
>
> When returning facet.field=<name of field> you get a list of the distinct values with their match counts. This is normal behavior. This patch tells you how many distinct values you have (# of rows). Use with limit=-1 and mincount=1.
> The feature is called "namedistinct". Here is an example:
> http://localhost:8983/solr/select?q=*:*&facet=true&facet.field=manu&facet.mincount=1&facet.limit=-1&f.manu.facet.namedistinct=0&facet.field=price&f.price.facet.namedistinct=1
> Here is an example on field "hgid" (without namedistinct):
> {code}
> <lst name="facet_fields">
>   <lst name="hgid">
>     <int name="HGPY0000045FD36D4000A">1</int>
>     <int name="HGPY00000FBC6690453A9">1</int>
>     <int name="HGPY00001E44ED6C4FB3B">1</int>
>     <int name="HGPY00001FA631034A1B8">1</int>
>     <int name="HGPY00003317ABAC43B48">1</int>
>     <int name="HGPY00003A17B2294CB5A">5</int>
>     <int name="HGPY00003ADD2B3D48C39">1</int>
>   </lst>
> </lst>
> {code}
> With namedistinct, the seven distinct values (HGPY0000045FD36D4000A, HGPY00000FBC6690453A9, HGPY00001E44ED6C4FB3B, HGPY00001FA631034A1B8, HGPY00003317ABAC43B48, HGPY00003A17B2294CB5A, HGPY00003ADD2B3D48C39) are reported as a single count. This returns the number of rows (7), not the sum of the per-value counts (11).
> {code}
> <lst name="facet_fields">
>   <lst name="hgid">
>     <int name="_count_">7</int>
>   </lst>
> </lst>
> {code}
> This works really well for getting the total number of groups for group.field=hgid. Enjoy!
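
To make the arithmetic in the quoted description concrete, here is a minimal Java sketch. It does not use any Solr API; the class and method names (NameDistinctSketch, distinctValues, totalCount, hgidExample) are made up for illustration. It only shows the difference between the number of distinct facet values (rows) and the sum of their counts for the hgid example.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration only; not part of the patch or of Solr.
class NameDistinctSketch {

    // The hgid facet values and counts copied from the example above.
    static Map<String, Integer> hgidExample() {
        Map<String, Integer> counts = new LinkedHashMap<String, Integer>();
        counts.put("HGPY0000045FD36D4000A", 1);
        counts.put("HGPY00000FBC6690453A9", 1);
        counts.put("HGPY00001E44ED6C4FB3B", 1);
        counts.put("HGPY00001FA631034A1B8", 1);
        counts.put("HGPY00003317ABAC43B48", 1);
        counts.put("HGPY00003A17B2294CB5A", 5);
        counts.put("HGPY00003ADD2B3D48C39", 1);
        return counts;
    }

    // Number of distinct facet values: what the description calls "# of rows".
    static int distinctValues(Map<String, Integer> facetCounts) {
        return facetCounts.size();
    }

    // Sum of the per-value counts.
    static int totalCount(Map<String, Integer> facetCounts) {
        int sum = 0;
        for (int c : facetCounts.values()) {
            sum += c;
        }
        return sum;
    }

    public static void main(String[] args) {
        Map<String, Integer> counts = hgidExample();
        System.out.println(distinctValues(counts)); // prints 7
        System.out.println(totalCount(counts));     // prints 11
    }
}
```

The distinct count is the figure that corresponds to the number of groups when grouping on the same field.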



[jira] [Updated] (SOLR-2242) Get distinct count of names for a facet field

Posted by "Bill Bell (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bill Bell updated SOLR-2242:
----------------------------

    Attachment: SOLR-2242-3x_5_tests.patch

3.x version with test cases.
                
> Get distinct count of names for a facet field
> ---------------------------------------------
>
>                 Key: SOLR-2242
>                 URL: https://issues.apache.org/jira/browse/SOLR-2242
>             Project: Solr
>          Issue Type: New Feature
>          Components: Response Writers
>    Affects Versions: 4.0
>            Reporter: Bill Bell
>            Priority: Minor
>             Fix For: 4.0
>
>         Attachments: SOLR-2242-3x.patch, SOLR-2242-3x_5_tests.patch, SOLR-2242-solr40-3.patch, SOLR-2242.patch, SOLR-2242.patch, SOLR-2242.patch, SOLR-2242.shard.withtests.patch, SOLR-2242.solr3.1-fix.patch, SOLR-2242.solr3.1.patch, SOLR.2242.solr3.1.patch
>
>
> When returning facet.field=<name of field> you will get a list of matches for distinct values. This is normal behavior. This patch tells you how many distinct values you have (# of rows). Use with limit=-1 and mincount=1.
> The feature is called "namedistinct". Here is an example:
> http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=2&facet.limit=-1&facet.field=price
> http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=0&facet.limit=-1&facet.field=price
> http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=1&facet.limit=-1&facet.field=price
> This currently only works on facet.field.
> {code}
> <lst name="facet_fields">
>   <lst name="price">
>     <int name="numFacetTerms">14</int>
>     <int name="0.0">3</int><int name="11.5">1</int><int name="19.95">1</int><int name="74.99">1</int><int name="92.0">1</int><int name="179.99">1</int><int name="185.0">1</int><int name="279.95">1</int><int name="329.95">1</int><int name="350.0">1</int><int name="399.0">1</int><int name="479.95">1</int><int name="649.99">1</int><int name="2199.0">1</int>
>   </lst>
> </lst>
> {code} 
> Several people use this to get the group.field count (the # of groups).



[jira] [Updated] (SOLR-2242) Get distinct count of names for a facet field

Posted by "Bill Bell (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bill Bell updated SOLR-2242:
----------------------------

    Attachment:     (was: SOLR-2242-3x_4.patch)
    
> Get distinct count of names for a facet field
> ---------------------------------------------
>
>                 Key: SOLR-2242
>                 URL: https://issues.apache.org/jira/browse/SOLR-2242
>             Project: Solr
>          Issue Type: New Feature
>          Components: Response Writers
>    Affects Versions: 4.0
>            Reporter: Bill Bell
>            Priority: Minor
>             Fix For: 4.0
>
>         Attachments: SOLR-2242-3x.patch, SOLR-2242-3x_5_tests.patch, SOLR-2242-solr40-3.patch, SOLR-2242.patch, SOLR-2242.patch, SOLR-2242.patch, SOLR-2242.shard.withtests.patch, SOLR-2242.solr3.1-fix.patch, SOLR-2242.solr3.1.patch, SOLR.2242.solr3.1.patch
>
>


[jira] [Updated] (SOLR-2242) Get distinct count of names for a facet field

Posted by "Lance Norskog (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lance Norskog updated SOLR-2242:
--------------------------------

    Attachment: SOLR-2242.solr3.1.patch

Putting up or shutting up :)


> Get distinct count of names for a facet field
> ---------------------------------------------
>
>                 Key: SOLR-2242
>                 URL: https://issues.apache.org/jira/browse/SOLR-2242
>             Project: Solr
>          Issue Type: New Feature
>          Components: Response Writers
>    Affects Versions: 4.0
>            Reporter: Bill Bell
>            Priority: Minor
>             Fix For: 4.0
>
>         Attachments: SOLR-2242.patch, SOLR-2242.solr3.1.patch, SOLR.2242.solr3.1.patch, SOLR.2242.v2.patch
>
>


[jira] [Commented] (SOLR-2242) Get distinct count of names for a facet field

Posted by "Bill Bell (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13233232#comment-13233232 ] 

Bill Bell commented on SOLR-2242:
---------------------------------

See https://issues.apache.org/jira/secure/attachment/12519024/SOLR-2242-solr40.patch for the patch.
                
> Get distinct count of names for a facet field
> ---------------------------------------------
>
>                 Key: SOLR-2242
>                 URL: https://issues.apache.org/jira/browse/SOLR-2242
>             Project: Solr
>          Issue Type: New Feature
>          Components: Response Writers
>    Affects Versions: 4.0
>            Reporter: Bill Bell
>            Assignee: Erick Erickson
>            Priority: Minor
>             Fix For: 4.0
>
>         Attachments: NumFacetTermsFacetsTest.java, SOLR-2242-notworkingtest.patch, SOLR-2242-solr40.patch, SOLR-2242.patch, SOLR-2242.patch, SOLR-2242.patch, SOLR-2242.shard.patch, SOLR-2242.shard.patch, SOLR-2242.shard.withtests.patch, SOLR-2242.solr3.1-fix.patch, SOLR-2242.solr3.1.patch, SOLR.2242.solr3.1.patch, SOLR.2242.v2.patch
>
>


[jira] [Updated] (SOLR-2242) Get distinct count of names for a facet field

Posted by "Nguyen Kien Trung (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Nguyen Kien Trung updated SOLR-2242:
------------------------------------

    Attachment: SOLR-2242.solr3.1-fix.patch

I'm using Solr 3.2. Instead of patching, I extended {{SimpleFacets}} and {{FacetComponent}} and applied the changes from [^SOLR-2242.solr3.1.patch], with a small fix ([^SOLR-2242.solr3.1-fix.patch]):
{code}
int offset = params.getFieldInt(facetValue, FacetParams.FACET_OFFSET, 0);
....
// counts holds only the terms from offset onward, so add the offset back
resCount.add("numTerms", counts.size() + offset);
{code}

The offset is added because {{counts}} contains only the terms starting from the given {{offset}}.

It accepts the parameter {{facet.numTerms=true|false}} and produces this output:
{code}
<lst name="facet_fields">
   <lst name="color">
      <int name="numTerms">124</int>
      <lst name="counts">
          <int name="red">4</int>
          <int name="blue">3</int>
      </lst>
   </lst>
</lst>
{code}
Not yet tested with sharding.
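
The offset correction above can be illustrated without Solr. The following is a minimal sketch using plain Java collections; NumTermsOffsetSketch, page, and numTerms are hypothetical names, and it assumes facet.limit=-1 (no upper limit) and an offset no larger than the number of terms.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; it mimics the paging behaviour described
// in the comment and does not call any Solr class.
class NumTermsOffsetSketch {

    // Returns the terms from the given offset onward (no limit applied),
    // which is what the comment says "counts" contains.
    static List<String> page(List<String> allTerms, int offset) {
        int from = Math.min(Math.max(offset, 0), allTerms.size());
        return allTerms.subList(from, allTerms.size());
    }

    // Total number of terms, recovered by adding the skipped terms back.
    // Only valid while offset <= total number of terms.
    static int numTerms(List<String> counts, int offset) {
        return counts.size() + offset;
    }

    public static void main(String[] args) {
        List<String> all = Arrays.asList("red", "blue", "green", "black", "white");
        List<String> counts = page(all, 2);
        System.out.println(counts.size());        // prints 3
        System.out.println(numTerms(counts, 2));  // prints 5
    }
}
```

Without the correction, a request with offset=2 would report 3 terms even though the field has 5.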

> Get distinct count of names for a facet field
> ---------------------------------------------
>
>                 Key: SOLR-2242
>                 URL: https://issues.apache.org/jira/browse/SOLR-2242
>             Project: Solr
>          Issue Type: New Feature
>          Components: Response Writers
>    Affects Versions: 4.0
>            Reporter: Bill Bell
>            Assignee: Simon Willnauer
>            Priority: Minor
>             Fix For: 4.0
>
>         Attachments: NumFacetTermsFacetsTest.java, SOLR-2242-notworkingtest.patch, SOLR-2242.patch, SOLR-2242.patch, SOLR-2242.shard.patch, SOLR-2242.shard.patch, SOLR-2242.shard.withtests.patch, SOLR-2242.solr3.1-fix.patch, SOLR-2242.solr3.1.patch, SOLR.2242.solr3.1.patch, SOLR.2242.v2.patch
>
>


[jira] [Commented] (SOLR-2242) Get distinct count of names for a facet field

Posted by "Cody Young (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13173479#comment-13173479 ] 

Cody Young commented on SOLR-2242:
----------------------------------

Simon, any plans for this patch? 

The general consensus seems to be that this is a good patch and desired functionality. The biggest issues seem to be the magic name and distributed support. I see a proposed solution by Yonik of changing the output format but that breaks distributed search. In addition, there is a worry about backwards compatibility and possibly supporting that through a parameter.

What if we choose a format that doesn't break backwards compatibility and possibly commit without supporting distributed for the first pass (or supporting the simple case of just adding it all together)? This would let us get some progress on this issue without having a magic name in the facet list.

If we went with a format like below then it wouldn't break backwards compatibility and it shouldn't affect anyone unless they choose to use the feature. This is also consistent with the way numFound works for the main search results. (Admittedly, it's different than ngroups, although we still see numFound used to represent the number of documents in a group.)

{code:xml} 
<lst name="facet_fields">
  <lst name="text" numFacetTerms="385">
    <int name="electronics">14</int>
    <int name="inc">8</int>
    <int name="2.0">5</int>
    <int name="lcd">5</int>
    <int name="memory">5</int>
  </lst>
</lst>
{code} 
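
As a rough illustration of how a client could consume the attribute-based format above, here is a sketch using the JDK's DOM parser. The class name, method name, and SAMPLE constant are hypothetical; the XML is the example from this comment, and nothing here is taken from the patch.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical client-side sketch; assumes the proposed response format.
class NumFacetTermsAttributeSketch {

    static final String SAMPLE =
        "<lst name=\"facet_fields\">"
        + "<lst name=\"text\" numFacetTerms=\"385\">"
        + "<int name=\"electronics\">14</int>"
        + "<int name=\"inc\">8</int>"
        + "<int name=\"2.0\">5</int>"
        + "<int name=\"lcd\">5</int>"
        + "<int name=\"memory\">5</int>"
        + "</lst>"
        + "</lst>";

    // Returns the numFacetTerms attribute for the given facet field,
    // or -1 if the field or the attribute is absent.
    static int readNumFacetTerms(String xml, String field) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            NodeList lists = doc.getElementsByTagName("lst");
            for (int i = 0; i < lists.getLength(); i++) {
                Element e = (Element) lists.item(i);
                if (field.equals(e.getAttribute("name")) && e.hasAttribute("numFacetTerms")) {
                    return Integer.parseInt(e.getAttribute("numFacetTerms"));
                }
            }
            return -1;
        } catch (Exception ex) {
            throw new IllegalStateException("could not parse facet response", ex);
        }
    }

    public static void main(String[] args) {
        System.out.println(readNumFacetTerms(SAMPLE, "text")); // prints 385
    }
}
```

A client that only iterates over the int children would not need to change, which is the backwards-compatibility point made above.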

Other smaller issues that appear to be outstanding:
* Change the code to cache the numFacetTerms/numTerms and remove the code that caches the huge term list.
* Determine the parameter name: facet.nconstraints=true|false was proposed, allowing facet.count to control the rest of the behavior.
                
> Get distinct count of names for a facet field
> ---------------------------------------------
>
>                 Key: SOLR-2242
>                 URL: https://issues.apache.org/jira/browse/SOLR-2242
>             Project: Solr
>          Issue Type: New Feature
>          Components: Response Writers
>    Affects Versions: 4.0
>            Reporter: Bill Bell
>            Assignee: Simon Willnauer
>            Priority: Minor
>             Fix For: 4.0
>
>         Attachments: NumFacetTermsFacetsTest.java, SOLR-2242-notworkingtest.patch, SOLR-2242.patch, SOLR-2242.patch, SOLR-2242.shard.patch, SOLR-2242.shard.patch, SOLR-2242.shard.withtests.patch, SOLR-2242.solr3.1-fix.patch, SOLR-2242.solr3.1.patch, SOLR.2242.solr3.1.patch, SOLR.2242.v2.patch
>
>


[jira] [Commented] (SOLR-2242) Get distinct count of names for a facet field

Posted by "Bill Bell (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13100095#comment-13100095 ] 

Bill Bell commented on SOLR-2242:
---------------------------------

Sharding will not work if you change the format of the facet results... We would need to fix sharding for this to go out... 

I am in a holding pattern until a committer helps.

> Get distinct count of names for a facet field
> ---------------------------------------------
>
>                 Key: SOLR-2242
>                 URL: https://issues.apache.org/jira/browse/SOLR-2242
>             Project: Solr
>          Issue Type: New Feature
>          Components: Response Writers
>    Affects Versions: 4.0
>            Reporter: Bill Bell
>            Assignee: Simon Willnauer
>            Priority: Minor
>             Fix For: 4.0
>
>         Attachments: NumFacetTermsFacetsTest.java, SOLR-2242-notworkingtest.patch, SOLR-2242.patch, SOLR-2242.patch, SOLR-2242.shard.patch, SOLR-2242.shard.patch, SOLR-2242.shard.withtests.patch, SOLR-2242.solr3.1-fix.patch, SOLR-2242.solr3.1.patch, SOLR.2242.solr3.1.patch, SOLR.2242.v2.patch
>
>


[jira] Issue Comment Edited: (SOLR-2242) Get distinct count of names for a facet field

Posted by "Bill Bell (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13006805#comment-13006805 ] 

Bill Bell edited comment on SOLR-2242 at 3/15/11 6:16 AM:
----------------------------------------------------------

OK this is complete.

Sample query:

http://localhost:8983/solr/select?q=*:*&facet=true&facet.field=cat&rows=0&facet.numfacetterms=2&facet.limit=4

Sample output:
{code}
<?xml version="1.0" encoding="UTF-8" ?> 
<response>
  <lst name="responseHeader">
    <int name="status">0</int> 
    <int name="QTime">0</int> 
    <lst name="params">
      <str name="facet.numfacetterms">2</str> 
      <str name="facet">true</str> 
      <str name="q">*:*</str> 
      <str name="facet.limit">4</str> 
      <str name="facet.field">cat</str> 
      <str name="rows">0</str> 
    </lst>
  </lst>
  <result name="response" numFound="17" start="0" /> 
  <lst name="facet_counts">
    <lst name="facet_queries" /> 
    <lst name="facet_fields">
      <lst name="cat">
        <int name="numFacetTerms">14</int> 
        <lst name="counts">
          <int name="electronics">14</int> 
          <int name="memory">3</int> 
          <int name="connector">2</int> 
          <int name="graphics card">2</int> 
        </lst>
      </lst>
    </lst>
    <lst name="facet_dates" /> 
    <lst name="facet_ranges" /> 
  </lst>
</response>
{code}

In JSON:

{code}
"facet_fields":{"cat":["numFacetTerms",14,"counts",["electronics",14,"memory",3,"connector",2,"graphics card",2]]},"facet_dates":{},"facet_ranges":{}}}
{code}

  
> Get distinct count of names for a facet field
> ---------------------------------------------
>
>                 Key: SOLR-2242
>                 URL: https://issues.apache.org/jira/browse/SOLR-2242
>             Project: Solr
>          Issue Type: New Feature
>          Components: Response Writers
>    Affects Versions: 4.0
>            Reporter: Bill Bell
>            Priority: Minor
>             Fix For: 4.0
>
>         Attachments: SOLR-2242-distinctFacet.patch, SOLR-2242.v2.patch
>
>


[jira] [Commented] (SOLR-2242) Get distinct count of names for a facet field

Posted by "Guna C (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13081906#comment-13081906 ] 

Guna C commented on SOLR-2242:
------------------------------

Hi Bill
I wanted to add that this is a great patch.  It provides a way to analyze which search terms are effective without having to retrieve all the docs themselves.  I was looking for a patch for 3.3.0.  Does the latest one work?
Thanks
-guna

> Get distinct count of names for a facet field
> ---------------------------------------------
>
>                 Key: SOLR-2242
>                 URL: https://issues.apache.org/jira/browse/SOLR-2242
>             Project: Solr
>          Issue Type: New Feature
>          Components: Response Writers
>    Affects Versions: 4.0
>            Reporter: Bill Bell
>            Assignee: Simon Willnauer
>            Priority: Minor
>             Fix For: 4.0
>
>         Attachments: NumFacetTermsFacetsTest.java, SOLR-2242-notworkingtest.patch, SOLR-2242.patch, SOLR-2242.patch, SOLR-2242.shard.patch, SOLR-2242.shard.patch, SOLR-2242.shard.withtests.patch, SOLR-2242.solr3.1.patch, SOLR.2242.solr3.1.patch, SOLR.2242.v2.patch
>
>


[jira] [Commented] (SOLR-2242) Get distinct count of names for a facet field

Posted by "Chris Male (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13070935#comment-13070935 ] 

Chris Male commented on SOLR-2242:
----------------------------------

Having walked through the SimpleFacets code, I see that PerSegmentSingleValuedFaceting has already introduced a FacetCollector.  I think we should take this and use it throughout all the different faceting 'Strategies'.  That way we can push the counting of constraints into the Collector.

I've also thought about the distribution issue.  The simplest option seems to be to return the max constraint count taken from all the shards.  With this, whether shards have distinct or overlapping constraint sets, clients can always treat this as the minimum number of constraints that exist.
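
To make the bound concrete, here is a minimal Python sketch of that merge rule. The function name merge_constraint_count and the sample shard data are made up for illustration; they are not taken from any patch attached to this issue.

```python
def merge_constraint_count(per_shard_counts):
    """Merge per-shard distinct-constraint counts by taking the max.

    The result is only a lower bound on the true number of distinct
    constraints across all shards, because shards may hold constraints
    that the largest shard does not.
    """
    return max(per_shard_counts, default=0)


# Hypothetical constraint sets held by two shards (they overlap on "b").
shard_a = {"a", "b", "c"}
shard_b = {"b", "d"}

# What the coordinator would report under the "max" rule.
reported = merge_constraint_count([len(shard_a), len(shard_b)])

# What an exact merge would report if the constraint sets themselves were shipped.
actual = len(shard_a | shard_b)
```

Here the merged value is 3 while the union holds 4 constraints, so a client can safely present the number as "at least 3".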

> Get distinct count of names for a facet field
> ---------------------------------------------
>
>                 Key: SOLR-2242
>                 URL: https://issues.apache.org/jira/browse/SOLR-2242
>             Project: Solr
>          Issue Type: New Feature
>          Components: Response Writers
>    Affects Versions: 4.0
>            Reporter: Bill Bell
>            Assignee: Simon Willnauer
>            Priority: Minor
>             Fix For: 4.0
>
>         Attachments: NumFacetTermsFacetsTest.java, SOLR-2242-notworkingtest.patch, SOLR-2242.patch, SOLR-2242.patch, SOLR-2242.shard.patch, SOLR-2242.shard.patch, SOLR-2242.shard.withtests.patch, SOLR-2242.solr3.1.patch, SOLR.2242.solr3.1.patch, SOLR.2242.v2.patch
>
>



[jira] [Commented] (SOLR-2242) Get distinct count of names for a facet field

Posted by "Lance Norskog (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13048480#comment-13048480 ] 

Lance Norskog commented on SOLR-2242:
-------------------------------------

There is a lot of complexity here, and having a bunch of orthogonal parameters is not quite enough. Looking at everything around facets, and group collapse, and the join trick, the Solr query syntax looks like the database world right before SQL. 

> Get distinct count of names for a facet field
> ---------------------------------------------
>
>                 Key: SOLR-2242
>                 URL: https://issues.apache.org/jira/browse/SOLR-2242
>             Project: Solr
>          Issue Type: New Feature
>          Components: Response Writers
>    Affects Versions: 4.0
>            Reporter: Bill Bell
>            Priority: Minor
>             Fix For: 4.0
>
>         Attachments: SOLR-2242.patch, SOLR-2242.shard.patch, SOLR-2242.solr3.1.patch, SOLR.2242.solr3.1.patch, SOLR.2242.v2.patch
>
>
> When returning facet.field=<name of field> you will get a list of matches for distinct values. This is normal behavior. This patch tells you how many distinct values you have (# of rows). Use with limit=-1 and mincount=1.
> The feature is called "namedistinct". Here is an example:
> http://localhost:8983/solr/select?q=*:*&facet=true&facet.field=manu&facet.mincount=1&facet.limit=-1&f.manu.facet.namedistinct=0&facet.field=price&f.price.facet.namedistinct=1
> Here is an example on field "hgid" (without namedistinct):
> {code}
> <lst name="facet_fields">
>   <lst name="hgid">
>     <int name="HGPY0000045FD36D4000A">1</int>
>     <int name="HGPY00000FBC6690453A9">1</int>
>     <int name="HGPY00001E44ED6C4FB3B">1</int>
>     <int name="HGPY00001FA631034A1B8">1</int>
>     <int name="HGPY00003317ABAC43B48">1</int>
>     <int name="HGPY00003A17B2294CB5A">5</int>
>     <int name="HGPY00003ADD2B3D48C39">1</int>
>   </lst>
> </lst>
> {code}
> With namedistinct (HGPY0000045FD36D4000A, HGPY00000FBC6690453A9, HGPY00001E44ED6C4FB3B, HGPY00001FA631034A1B8, HGPY00003317ABAC43B48, HGPY00003A17B2294CB5A, HGPY00003ADD2B3D48C39). This returns number of rows (7), not the number of values (11).
> {code}
> <lst name="facet_fields">
>   <lst name="hgid">
>     <int name="_count_">7</int>
>   </lst>
> </lst>
> {code}
> This actually works really well for getting the total number of groups for group.field=hgid. Enjoy!



[jira] Commented: (SOLR-2242) Get distinct count of names for a facet field

Posted by "Bill Bell (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13006747#comment-13006747 ] 

Bill Bell commented on SOLR-2242:
---------------------------------

I am pretty new to patching stuff. Can I get some sort of committer to
give me feedback?

I would also LOVE to get this in the TRUNK.






> Get distinct count of names for a facet field
> ---------------------------------------------
>
>                 Key: SOLR-2242
>                 URL: https://issues.apache.org/jira/browse/SOLR-2242
>             Project: Solr
>          Issue Type: New Feature
>          Components: Response Writers
>    Affects Versions: 4.0
>            Reporter: Bill Bell
>            Priority: Minor
>             Fix For: 4.0
>
>         Attachments: SOLR-2242-distinctFacet.patch
>
>



[jira] [Updated] (SOLR-2242) Get distinct count of names for a facet field

Posted by "Bill Bell (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bill Bell updated SOLR-2242:
----------------------------

    Attachment: SOLR-2242.shard.patch

New patch ready for commit?

> Get distinct count of names for a facet field
> ---------------------------------------------
>
>                 Key: SOLR-2242
>                 URL: https://issues.apache.org/jira/browse/SOLR-2242
>             Project: Solr
>          Issue Type: New Feature
>          Components: Response Writers
>    Affects Versions: 4.0
>            Reporter: Bill Bell
>            Assignee: Simon Willnauer
>            Priority: Minor
>             Fix For: 4.0
>
>         Attachments: SOLR-2242.patch, SOLR-2242.shard.patch, SOLR-2242.shard.patch, SOLR-2242.solr3.1.patch, SOLR.2242.solr3.1.patch, SOLR.2242.v2.patch
>
>



[jira] [Issue Comment Edited] (SOLR-2242) Get distinct count of names for a facet field

Posted by "Lance Norskog (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13026124#comment-13026124 ] 

Lance Norskog edited comment on SOLR-2242 at 4/28/11 5:33 AM:
--------------------------------------------------------------

Putting up or shutting up :)

This splits apart whether to count terms vs. whether to count docs per term. They are independent concepts.

Instead of 'numFacetTerms=0/1/2' it is 'numTerms=true/false'.
If you set 'numTerms=true', it counts terms.
If you set 'facet.limit=0', it does not do the facet search. It does not count docs per term.
If you set 'numTerms=false' and 'facet.limit=0', it does nothing.

'numFacetTerms' is redundant: we know it's all about facets. Thus, 'numTerms'.
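
As a rough illustration of how the two switches combine, here is a small Python sketch. It is not code from the patch; the function plan_facet_work and its result keys are hypothetical names chosen only to restate the rules above.

```python
def plan_facet_work(num_terms, facet_limit):
    """Decide which work to do for one facet field.

    num_terms:   value of the proposed 'numTerms' flag (True/False).
    facet_limit: value of 'facet.limit'; 0 means "skip the facet search".
    """
    return {
        # Counting distinct terms depends only on numTerms.
        "count_terms": bool(num_terms),
        # Counting docs per term depends only on facet.limit.
        "count_docs_per_term": facet_limit != 0,
    }


# The four combinations described in the comment.
both = plan_facet_work(num_terms=True, facet_limit=-1)
terms_only = plan_facet_work(num_terms=True, facet_limit=0)
docs_only = plan_facet_work(num_terms=False, facet_limit=-1)
nothing = plan_facet_work(num_terms=False, facet_limit=0)
```

Because each output depends on exactly one input, the two concepts stay independent, which is the point of splitting them.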


      was (Author: lancenorskog):
    Putting up or shutting up :)

This splits apart whether to count terms v.s. whether to count docs per term. They are independent concepts.

Instead of 'numFacetTerms=0/1/2' it is 'numTerms=true/false'.
if you set 'numTerms=true', it counts terms.
If you set facet.limit=0, it does not do the facet search. It does not count docs per term.
If you set 'numTerms=false' and 'facet.limit=0', it does nothing.

And, everything is called 'facet' and 'term' :)

  
> Get distinct count of names for a facet field
> ---------------------------------------------
>
>                 Key: SOLR-2242
>                 URL: https://issues.apache.org/jira/browse/SOLR-2242
>             Project: Solr
>          Issue Type: New Feature
>          Components: Response Writers
>    Affects Versions: 4.0
>            Reporter: Bill Bell
>            Priority: Minor
>             Fix For: 4.0
>
>         Attachments: SOLR-2242.patch, SOLR-2242.solr3.1.patch, SOLR.2242.solr3.1.patch, SOLR.2242.v2.patch
>
>



[jira] [Issue Comment Edited] (SOLR-2242) Get distinct count of names for a facet field

Posted by "Antoine Le Floc'h (Issue Comment Edited) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13174739#comment-13174739 ] 

Antoine Le Floc'h edited comment on SOLR-2242 at 12/22/11 10:57 AM:
--------------------------------------------------------------------

To help with the specification, my use case is this: I am using this patch, possibly want to add extra information to the facet results, and want to use sharding... Basically, this is what I have today with the patch:

<lst name="shop_id">
  <int name="numTerms">10251</int>
  <lst name="counts">
    <int name="28013756">7032406</int>
    <int name="28009589">3616625</int>
    <int name="976">3497825</int>
    <int name="635">1398780</int>
    <int name="28021713">440118</int>
    <int name="29047336">368921</int>
    <int name="411">244689</int>
  </lst>
</lst>

and I want to subclass/modify SimpleFacets to add more data for each item (since I don't see any other way to do it).
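
For anyone consuming this layout on the client side, here is a minimal Python sketch that reads the fragment above. It assumes the response shape is exactly as pasted (a "numTerms" int next to a "counts" list); the function name parse_facet_field is made up for the example.

```python
import xml.etree.ElementTree as ET

# The facet fragment quoted in this comment.
SAMPLE = """<lst name="shop_id">
  <int name="numTerms">10251</int>
  <lst name="counts">
    <int name="28013756">7032406</int>
    <int name="28009589">3616625</int>
    <int name="976">3497825</int>
    <int name="635">1398780</int>
    <int name="28021713">440118</int>
    <int name="29047336">368921</int>
    <int name="411">244689</int>
  </lst>
</lst>"""


def parse_facet_field(xml_text):
    """Return (field name, numTerms or None, {term: doc count})."""
    root = ET.fromstring(xml_text)
    num_terms_el = root.find("int[@name='numTerms']")
    num_terms = int(num_terms_el.text) if num_terms_el is not None else None
    counts = {}
    counts_el = root.find("lst[@name='counts']")
    if counts_el is not None:
        for el in counts_el.findall("int"):
            counts[el.get("name")] = int(el.text)
    return root.get("name"), num_terms, counts


field, num_terms, counts = parse_facet_field(SAMPLE)
```

Keeping the term count outside the "counts" list means a client never has to filter a special entry out of the per-term data.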
                
      was (Author: alefloch):
    I am using this patch and possibly want to add extra infos in the facet results, and want to use sharding... Is there an associated patch to fix sharding ? Is it an easy fix ? Is this working out of the box in 4.0 ? Thank you.
                  
> Get distinct count of names for a facet field
> ---------------------------------------------
>
>                 Key: SOLR-2242
>                 URL: https://issues.apache.org/jira/browse/SOLR-2242
>             Project: Solr
>          Issue Type: New Feature
>          Components: Response Writers
>    Affects Versions: 4.0
>            Reporter: Bill Bell
>            Assignee: Erick Erickson
>            Priority: Minor
>             Fix For: 4.0
>
>         Attachments: NumFacetTermsFacetsTest.java, SOLR-2242-notworkingtest.patch, SOLR-2242.patch, SOLR-2242.patch, SOLR-2242.patch, SOLR-2242.shard.patch, SOLR-2242.shard.patch, SOLR-2242.shard.withtests.patch, SOLR-2242.solr3.1-fix.patch, SOLR-2242.solr3.1.patch, SOLR.2242.solr3.1.patch, SOLR.2242.v2.patch
>
>



[jira] [Updated] (SOLR-2242) Get distinct count of names for a facet field

Posted by "Bill Bell (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bill Bell updated SOLR-2242:
----------------------------

    Attachment: SOLR-2242.shard.patch

> Get distinct count of names for a facet field
> ---------------------------------------------
>
>                 Key: SOLR-2242
>                 URL: https://issues.apache.org/jira/browse/SOLR-2242
>             Project: Solr
>          Issue Type: New Feature
>          Components: Response Writers
>    Affects Versions: 4.0
>            Reporter: Bill Bell
>            Priority: Minor
>             Fix For: 4.0
>
>         Attachments: SOLR-2242.patch, SOLR-2242.shard.patch, SOLR-2242.solr3.1.patch, SOLR.2242.solr3.1.patch, SOLR.2242.v2.patch
>
>



[jira] [Updated] (SOLR-2242) Get distinct count of names for a facet field

Posted by "Bill Bell (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bill Bell updated SOLR-2242:
----------------------------

    Attachment: NumFacetTermsFacetsTest.java

Just replace this test file to fix the insanity.

> Get distinct count of names for a facet field
> ---------------------------------------------
>
>                 Key: SOLR-2242
>                 URL: https://issues.apache.org/jira/browse/SOLR-2242
>             Project: Solr
>          Issue Type: New Feature
>          Components: Response Writers
>    Affects Versions: 4.0
>            Reporter: Bill Bell
>            Assignee: Simon Willnauer
>            Priority: Minor
>             Fix For: 4.0
>
>         Attachments: NumFacetTermsFacetsTest.java, SOLR-2242-notworkingtest.patch, SOLR-2242.patch, SOLR-2242.patch, SOLR-2242.shard.patch, SOLR-2242.shard.patch, SOLR-2242.shard.withtests.patch, SOLR-2242.solr3.1.patch, SOLR.2242.solr3.1.patch, SOLR.2242.v2.patch
>
>



[jira] [Commented] (SOLR-2242) Get distinct count of names for a facet field

Posted by "uygar bayar (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13257323#comment-13257323 ] 

uygar bayar commented on SOLR-2242:
-----------------------------------

Hi,
I tried it on 3.6.0 with SOLR-2242-3x_5_tests.patch, but it didn't work. Results are grouped, but all facets are empty.

<lst name="facet_counts">
  <lst name="facet_queries"/>
  <lst name="facet_fields"/>
  <lst name="facet_numTerms"/>
  <lst name="facet_dates"/>
  <lst name="facet_ranges"/>
</lst>

http://x.x.x.x:8985/solr/ar1/select?shards=192.168.200.202:8985/solr/ar3/,192.168.200.202:8985/solr/ar4&q=hotels&group=true&group.field=site&facet=true&f.site.facet.numFacetTerms=1&facet.mincount=1&facet.limit=-1
                
> Get distinct count of names for a facet field
> ---------------------------------------------
>
>                 Key: SOLR-2242
>                 URL: https://issues.apache.org/jira/browse/SOLR-2242
>             Project: Solr
>          Issue Type: New Feature
>          Components: Response Writers
>    Affects Versions: 4.0
>            Reporter: Bill Bell
>            Priority: Minor
>             Fix For: 4.0
>
>         Attachments: SOLR-2242-3x.patch, SOLR-2242-3x_5_tests.patch, SOLR-2242-solr40-3.patch, SOLR-2242.patch, SOLR-2242.patch, SOLR-2242.patch, SOLR-2242.shard.withtests.patch, SOLR-2242.solr3.1-fix.patch, SOLR-2242.solr3.1.patch, SOLR.2242.solr3.1.patch
>
>



[jira] [Issue Comment Edited] (SOLR-2242) Get distinct count of names for a facet field

Posted by "Lance Norskog (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13024966#comment-13024966 ] 

Lance Norskog edited comment on SOLR-2242 at 4/28/11 2:01 AM:
--------------------------------------------------------------

From the patch:
bq. {{public static final String FACET_NAMEDISTINCT = FACET + ".numFacetTerms";}}
So, in this issue, a _name_ is what everything else calls a _term_, and a _value_ is what everyone else calls a "_count of documents with *this term* in *this field*_". Please change this in the patch.
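
To pin the two notions down, here is a small Python sketch using the hgid example from the issue description. The variable names are made up for the example.

```python
# Per-term document counts for field "hgid", copied from the issue description.
hgid_counts = {
    "HGPY0000045FD36D4000A": 1,
    "HGPY00000FBC6690453A9": 1,
    "HGPY00001E44ED6C4FB3B": 1,
    "HGPY00001FA631034A1B8": 1,
    "HGPY00003317ABAC43B48": 1,
    "HGPY00003A17B2294CB5A": 5,
    "HGPY00003ADD2B3D48C39": 1,
}

# Number of distinct terms: what the description calls "rows" or "names".
num_terms = len(hgid_counts)

# Sum of the per-term document counts: what the description calls "values".
total_doc_count = sum(hgid_counts.values())
```

The first number is the term count (7) and the second is the sum of the per-term document counts (11), matching the figures given in the description.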






      was (Author: lancenorskog):
    From the patch:
bq. {{public static final String FACET_NAMEDISTINCT = FACET + ".numFacetTerms";}}
So- in this issue, a _name_ is what everything else calls a _term_. Please change this in the patch.





  
> Get distinct count of names for a facet field
> ---------------------------------------------
>
>                 Key: SOLR-2242
>                 URL: https://issues.apache.org/jira/browse/SOLR-2242
>             Project: Solr
>          Issue Type: New Feature
>          Components: Response Writers
>    Affects Versions: 4.0
>            Reporter: Bill Bell
>            Priority: Minor
>             Fix For: 4.0
>
>         Attachments: SOLR.2242.solr3.1.patch, SOLR.2242.v2.patch
>
>



[jira] [Commented] (SOLR-2242) Get distinct count of names for a facet field

Posted by "Ryan McKinley (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13071147#comment-13071147 ] 

Ryan McKinley commented on SOLR-2242:
-------------------------------------

bq. The simplest option seems to be to return the max constraint count taken from all the shards

That seems reasonable -- though I think we would also want to be able to have the sum when you know that all shards have unique values.

I don't think Bill is referring to the accuracy or meaning of the distinct count in distributed search.  His problem is that if we change the output format, we also need to update the code that collects the various values and passes them along.  This patch just adds a magic value (numFacetTerms) to the count list so that the value is handled by the existing distributed response parsing.  This is a fine one-off solution, but I am -1 on adding any more magic field names to Solr.  To add this feature, I think we need to bite the bullet and update the facet response format.
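
To make the max-versus-sum trade-off concrete, here is a minimal Python sketch. It is not taken from any of the attached patches: it assumes each shard reports only its own distinct-term count for a field, and the function name, the flag, and the sample numbers are all hypothetical. The maximum is only a lower bound on the true distinct count, while the sum is exact only when no term appears on more than one shard.

```python
def merge_distinct_counts(shard_counts, shards_disjoint=False):
    """Combine per-shard distinct-term counts for one facet field.

    shard_counts: list of ints, one per shard (hypothetical input shape).
    shards_disjoint: pass True only if no term can occur on two shards.
    """
    if not shard_counts:
        return 0
    if shards_disjoint:
        # Every term lives on exactly one shard, so the sum is exact.
        return sum(shard_counts)
    # Terms may overlap between shards, so the largest per-shard count
    # is only a lower bound on the true number of distinct terms.
    return max(shard_counts)


print(merge_distinct_counts([14, 9]))                        # lower bound
print(merge_distinct_counts([14, 9], shards_disjoint=True))  # exact sum
```

An exact answer for overlapping shards would need the term lists themselves, which is why the merge strategy and the response format end up being discussed together.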



> Get distinct count of names for a facet field
> ---------------------------------------------
>
>                 Key: SOLR-2242
>                 URL: https://issues.apache.org/jira/browse/SOLR-2242
>             Project: Solr
>          Issue Type: New Feature
>          Components: Response Writers
>    Affects Versions: 4.0
>            Reporter: Bill Bell
>            Assignee: Simon Willnauer
>            Priority: Minor
>             Fix For: 4.0
>
>         Attachments: NumFacetTermsFacetsTest.java, SOLR-2242-notworkingtest.patch, SOLR-2242.patch, SOLR-2242.patch, SOLR-2242.shard.patch, SOLR-2242.shard.patch, SOLR-2242.shard.withtests.patch, SOLR-2242.solr3.1.patch, SOLR.2242.solr3.1.patch, SOLR.2242.v2.patch
>
>
> When returning facet.field=<name of field> you will get a list of matches for distinct values. This is normal behavior. This patch tells you how many distinct values you have (# of rows). Use with limit=-1 and mincount=1.
> The feature is called "namedistinct". Here is an example:
> http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=2&facet.limit=-1&facet.field=price
> http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=0&facet.limit=-1&facet.field=price
> http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=1&facet.limit=-1&facet.field=price
> This currently only works on facet.field.
> {code}
> <lst name="facet_fields">
>   <lst name="price">
>     <int name="numFacetTerms">14</int>
>     <int name="0.0">3</int><int name="11.5">1</int><int name="19.95">1</int><int name="74.99">1</int><int name="92.0">1</int><int name="179.99">1</int><int name="185.0">1</int><int name="279.95">1</int><int name="329.95">1</int><int name="350.0">1</int><int name="399.0">1</int><int name="479.95">1</int><int name="649.99">1</int><int name="2199.0">1</int>
>   </lst>
> </lst>
> {code} 
> Several people use this to get the group.field count (the # of groups).



[jira] Commented: (SOLR-2242) Get distinct count of names for a facet field

Posted by "Jonathan Rochkind (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13006782#comment-13006782 ] 

Jonathan Rochkind commented on SOLR-2242:
-----------------------------------------

There is clearly a semantic problem here. I call that the number of 'facet values'; what you are calling a 'name' I am calling a 'facet value'. I have no idea what you are calling a 'value', honestly.  I'm pretty sure we're talking about the same thing. I have no idea what word to use that will mean that to both of us and everyone else. 

I guess what you are calling 'number of values', if I understand properly, I'd call 'sum of the facet counts'.  Facet counts are already called facet counts. Summing them up is the sum of them. It's not a 'number of values'. (I also can't imagine any use case where you'd want a sum of facet counts; for a single-valued field with no facet.missing, the sum of the facet counts will equal the document count, numRows. In other cases it may not, and I have no idea why you'd ever want it in those cases.)   But the name is less important than the functionality, I guess. (Except that the lack of consistent terminology in Solr is what leads us to this confusion.) Okay, wait, numFacetTerms: is that maybe clear? 'Terms', since Solr 'terms' are in fact what appear as the values/names in Solr faceting? From the wiki page for facet.field: "It will iterate over each Term in the field and generate a facet count using that Term as the constraint. "
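
The two quantities being argued over can be shown with the "hgid" facet counts quoted in the issue description below. This is only an illustrative Python sketch; the variable names are made up for the example.

```python
# Facet counts for the "hgid" field, copied from the issue description.
facet_counts = {
    "HGPY0000045FD36D4000A": 1,
    "HGPY00000FBC6690453A9": 1,
    "HGPY00001E44ED6C4FB3B": 1,
    "HGPY00001FA631034A1B8": 1,
    "HGPY00003317ABAC43B48": 1,
    "HGPY00003A17B2294CB5A": 5,
    "HGPY00003ADD2B3D48C39": 1,
}

# Number of distinct terms (what the patch reports).
num_terms = len(facet_counts)

# Sum of the facet counts (what the description calls "number of values").
sum_of_counts = sum(facet_counts.values())

print(num_terms, sum_of_counts)  # 7 11
```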

But perhaps I also misunderstood: the functionality is of use/interest to me only if it does NOT require me to set facet.limit=-1 to get this count of distinct values/names/terms.  If I'm setting facet.limit=-1 anyway, that number is already implicit in the response, so there is not much value added in making it explicit.  What I need is a way to get this number without setting facet.limit=-1, since in my use cases I can have a million or more, um, values/names/terms. (Which Solr 1.4.1 with facet.method=fc handles with aplomb!)  If your patch only works with facet.limit=-1, it does not actually address my need. 


> Get distinct count of names for a facet field
> ---------------------------------------------
>
>                 Key: SOLR-2242
>                 URL: https://issues.apache.org/jira/browse/SOLR-2242
>             Project: Solr
>          Issue Type: New Feature
>          Components: Response Writers
>    Affects Versions: 4.0
>            Reporter: Bill Bell
>            Priority: Minor
>             Fix For: 4.0
>
>         Attachments: SOLR-2242-distinctFacet.patch
>
>
> When returning facet.field=<name of field> you will get a list of matches for distinct values. This is normal behavior. This patch tells you how many distinct values you have (# of rows). Use with limit=-1 and mincount=1.
> The feature is called "namedistinct". Here is an example:
> http://localhost:8983/solr/select?q=*:*&facet=true&facet.field=manu&facet.mincount=1&facet.limit=-1&f.manu.facet.namedistinct=0&facet.field=price&f.price.facet.namedistinct=1
> Here is an example on field "hgid" (without namedistinct):
> {code}
> <lst name="facet_fields">
>   <lst name="hgid">
>     <int name="HGPY0000045FD36D4000A">1</int>
>     <int name="HGPY00000FBC6690453A9">1</int>
>     <int name="HGPY00001E44ED6C4FB3B">1</int>
>     <int name="HGPY00001FA631034A1B8">1</int>
>     <int name="HGPY00003317ABAC43B48">1</int>
>     <int name="HGPY00003A17B2294CB5A">5</int>
>     <int name="HGPY00003ADD2B3D48C39">1</int>
>   </lst>
> </lst>
> {code}
> With namedistinct (HGPY0000045FD36D4000A, HGPY00000FBC6690453A9, HGPY00001E44ED6C4FB3B, HGPY00001FA631034A1B8, HGPY00003317ABAC43B48, HGPY00003A17B2294CB5A, HGPY00003ADD2B3D48C39), this returns the number of rows (7), not the number of values (11).
> {code}
> <lst name="facet_fields">
>   <lst name="hgid">
>     <int name="_count_">7</int>
>   </lst>
> </lst>
> {code}
> This actually works really well for getting the total number of fields for a group.field=hgid. Enjoy!



[jira] [Updated] (SOLR-2242) Get distinct count of names for a facet field

Posted by "Simon Willnauer (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Simon Willnauer updated SOLR-2242:
----------------------------------

    Attachment: SOLR-2242.patch

Bill, thanks for the unit test. I need to look into the FieldCache issue before we go further, though. I don't see an NPE here, however.

I fixed some whitespace issues in the patch and refactored your impl to use a switch statement instead of if/else, which I think is less verbose and has less duplication, but as you said, that's mainly a style issue.

I will look into the FC issue and move forward here ASAP. Thanks Bill

> Get distinct count of names for a facet field
> ---------------------------------------------
>
>                 Key: SOLR-2242
>                 URL: https://issues.apache.org/jira/browse/SOLR-2242
>             Project: Solr
>          Issue Type: New Feature
>          Components: Response Writers
>    Affects Versions: 4.0
>            Reporter: Bill Bell
>            Assignee: Simon Willnauer
>            Priority: Minor
>             Fix For: 4.0
>
>         Attachments: SOLR-2242-notworkingtest.patch, SOLR-2242.patch, SOLR-2242.patch, SOLR-2242.shard.patch, SOLR-2242.shard.patch, SOLR-2242.shard.withtests.patch, SOLR-2242.solr3.1.patch, SOLR.2242.solr3.1.patch, SOLR.2242.v2.patch
>
>
> When returning facet.field=<name of field> you will get a list of matches for distinct values. This is normal behavior. This patch tells you how many distinct values you have (# of rows). Use with limit=-1 and mincount=1.
> The feature is called "namedistinct". Here is an example:
> http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=2&facet.limit=-1&facet.field=price
> http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=0&facet.limit=-1&facet.field=price
> http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=1&facet.limit=-1&facet.field=price
> This currently only works on facet.field.
> {code}
> <lst name="facet_fields">
>   <lst name="price">
>     <int name="numFacetTerms">14</int>
>     <int name="0.0">3</int><int name="11.5">1</int><int name="19.95">1</int><int name="74.99">1</int><int name="92.0">1</int><int name="179.99">1</int><int name="185.0">1</int><int name="279.95">1</int><int name="329.95">1</int><int name="350.0">1</int><int name="399.0">1</int><int name="479.95">1</int><int name="649.99">1</int><int name="2199.0">1</int>
>   </lst>
> </lst>
> {code} 
> Several people use this to get the group.field count (the # of groups).



[jira] [Commented] (SOLR-2242) Get distinct count of names for a facet field

Posted by "Bill Bell (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13233224#comment-13233224 ] 

Bill Bell commented on SOLR-2242:
---------------------------------

How does it work?

{code}
http://localhost:8983/solr/select?q=*:*&facet=true&facet.field=cat&facet.field=price&f.price.facet.numTerms=true&facet.limit=-1&f.cat.facet.numTerms=true&f.price.facet.limit=1
{code}

Parameters:

facet.numTerms or f.<field>.facet.numTerms = true (default is false) - turn on distinct counting of terms
facet.field - the field to count the terms

It creates a new section in the facet section... For example:

{code}
<lst name="facet_counts">
  <lst name="facet_queries"/>
  <lst name="facet_fields">
    <lst name="cat">
      <int name="camera">1</int>
      <int name="connector">2</int>
      <int name="copier">1</int>
      <int name="currency">4</int>
      <int name="electronics">14</int>
      <int name="graphics card">2</int>
      <int name="hard drive">2</int>
      <int name="memory">3</int>
      <int name="monitor">2</int>
      <int name="multifunction printer">1</int>
      <int name="music">1</int>
      <int name="printer">1</int>
      <int name="scanner">1</int>
      <int name="search">2</int>
      <int name="software">2</int>
    </lst>
    <lst name="price">
      <int name="0.0">3</int>
    </lst>
  </lst>
  <lst name="facet_numTerms">
    <int name="cat">15</int>
    <int name="price">14</int>
  </lst>
  <lst name="facet_dates"/>
  <lst name="facet_ranges"/>
</lst>
{code}
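
For client code, reading the new section could look like the following Python sketch. It assumes a Solr build with this patch applied and the XML layout shown above; the helper name read_num_terms is hypothetical, and the response string is a trimmed copy of the example rather than real server output.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the facet_counts section shown above.
response = """
<lst name="facet_counts">
  <lst name="facet_fields">
    <lst name="price">
      <int name="0.0">3</int>
    </lst>
  </lst>
  <lst name="facet_numTerms">
    <int name="cat">15</int>
    <int name="price">14</int>
  </lst>
</lst>
"""


def read_num_terms(xml_text):
    """Return {field: distinct term count} from a facet_counts element."""
    root = ET.fromstring(xml_text)
    section = root.find("./lst[@name='facet_numTerms']")
    if section is None:
        # Section is absent when numTerms was not requested.
        return {}
    return {node.get("name"): int(node.text) for node in section.findall("int")}


num_terms = read_num_terms(response)
print(num_terms)  # {'cat': 15, 'price': 14}
```

Note that the price facet list holds a single entry because of f.price.facet.limit=1, while the numTerms value for price is still 14.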





                
> Get distinct count of names for a facet field
> ---------------------------------------------
>
>                 Key: SOLR-2242
>                 URL: https://issues.apache.org/jira/browse/SOLR-2242
>             Project: Solr
>          Issue Type: New Feature
>          Components: Response Writers
>    Affects Versions: 4.0
>            Reporter: Bill Bell
>            Assignee: Erick Erickson
>            Priority: Minor
>             Fix For: 4.0
>
>         Attachments: NumFacetTermsFacetsTest.java, SOLR-2242-notworkingtest.patch, SOLR-2242-solr40.patch, SOLR-2242.patch, SOLR-2242.patch, SOLR-2242.patch, SOLR-2242.shard.patch, SOLR-2242.shard.patch, SOLR-2242.shard.withtests.patch, SOLR-2242.solr3.1-fix.patch, SOLR-2242.solr3.1.patch, SOLR.2242.solr3.1.patch, SOLR.2242.v2.patch
>
>
> When returning facet.field=<name of field> you will get a list of matches for distinct values. This is normal behavior. This patch tells you how many distinct values you have (# of rows). Use with limit=-1 and mincount=1.
> The feature is called "namedistinct". Here is an example:
> http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=2&facet.limit=-1&facet.field=price
> http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=0&facet.limit=-1&facet.field=price
> http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=1&facet.limit=-1&facet.field=price
> This currently only works on facet.field.
> {code}
> <lst name="facet_fields">
>   <lst name="price">
>     <int name="numFacetTerms">14</int>
>     <int name="0.0">3</int><int name="11.5">1</int><int name="19.95">1</int><int name="74.99">1</int><int name="92.0">1</int><int name="179.99">1</int><int name="185.0">1</int><int name="279.95">1</int><int name="329.95">1</int><int name="350.0">1</int><int name="399.0">1</int><int name="479.95">1</int><int name="649.99">1</int><int name="2199.0">1</int>
>   </lst>
> </lst>
> {code} 
> Several people use this to get the group.field count (the # of groups).



[jira] [Issue Comment Edited] (SOLR-2242) Get distinct count of names for a facet field

Posted by "Erick Erickson (Issue Comment Edited) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13174441#comment-13174441 ] 

Erick Erickson edited comment on SOLR-2242 at 12/21/11 9:51 PM:
----------------------------------------------------------------

First step in resurrecting this. This patch should apply cleanly to trunk. It incorporates the SOLR-2242.patch from 28-June and the NumFacetTermsFacetsTest from 9-July. It accounts for the fact that things seem to have been moved around a bit. All I guarantee is that the code compiles and the NumFacetTermsFacetsTest runs from inside IntelliJ.
                
      was (Author: erickerickson):
    First step in resurrecting this. This patch should apply cleanly to trunk. It incorporates the SOLR-2242.patch from 28-June and the NumFacetTermsFacetsTest from 9-July. It accounts for the fact that things seem to have been moved around a bit.
                  
> Get distinct count of names for a facet field
> ---------------------------------------------
>
>                 Key: SOLR-2242
>                 URL: https://issues.apache.org/jira/browse/SOLR-2242
>             Project: Solr
>          Issue Type: New Feature
>          Components: Response Writers
>    Affects Versions: 4.0
>            Reporter: Bill Bell
>            Assignee: Erick Erickson
>            Priority: Minor
>             Fix For: 4.0
>
>         Attachments: NumFacetTermsFacetsTest.java, SOLR-2242-notworkingtest.patch, SOLR-2242.patch, SOLR-2242.patch, SOLR-2242.patch, SOLR-2242.shard.patch, SOLR-2242.shard.patch, SOLR-2242.shard.withtests.patch, SOLR-2242.solr3.1-fix.patch, SOLR-2242.solr3.1.patch, SOLR.2242.solr3.1.patch, SOLR.2242.v2.patch
>
>
> When returning facet.field=<name of field> you will get a list of matches for distinct values. This is normal behavior. This patch tells you how many distinct values you have (# of rows). Use with limit=-1 and mincount=1.
> The feature is called "namedistinct". Here is an example:
> http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=2&facet.limit=-1&facet.field=price
> http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=0&facet.limit=-1&facet.field=price
> http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=1&facet.limit=-1&facet.field=price
> This currently only works on facet.field.
> {code}
> <lst name="facet_fields">
>   <lst name="price">
>     <int name="numFacetTerms">14</int>
>     <int name="0.0">3</int><int name="11.5">1</int><int name="19.95">1</int><int name="74.99">1</int><int name="92.0">1</int><int name="179.99">1</int><int name="185.0">1</int><int name="279.95">1</int><int name="329.95">1</int><int name="350.0">1</int><int name="399.0">1</int><int name="479.95">1</int><int name="649.99">1</int><int name="2199.0">1</int>
>   </lst>
> </lst>
> {code} 
> Several people use this to get the group.field count (the # of groups).



[jira] [Comment Edited] (SOLR-2242) Get distinct count of names for a facet field

Posted by "Bill Bell (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13495020#comment-13495020 ] 

Bill Bell edited comment on SOLR-2242 at 11/11/12 10:35 PM:
------------------------------------------------------------

uygar,

You are not using it properly. SOLR-2242-3x_5_tests.patch  does indeed work.

http://x.x.x.x:8985/solr/ar1/select?shards=192.168.200.202:8985/solr/ar3/,192.168.200.202:8985/solr/ar4&q=hotels&group=true&group.field=site&facet=true&f.site.facet.numFacetTerms=1&facet.mincount=1&facet.limit=-1

You forgot facet.field=site, and the parameter is f.site.facet.numTerms=true

With the sample data, do the following:

Copy example to example2, and change jetty.xml on example2 to be port 8080.

Run this:

http://localhost:8983/solr/select?shards=localhost:8983/solr/,localhost:8080/solr/&q=*:*&rows=0&facet=true&facet.field=price&facet.numTerms=true&facet.mincount=1&facet.limit=-1
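
A request like the one above can also be assembled programmatically, which makes it harder to leave out facet.field. The Python sketch below is an illustration only: the helper name num_terms_url is hypothetical, and the parameter names are the ones used in this comment, which exist only with the patch applied.

```python
from urllib.parse import urlencode


def num_terms_url(base_url, field, query="*:*", shards=None):
    """Build a select URL that asks for the distinct term count of `field`."""
    params = [
        ("q", query),
        ("rows", "0"),
        ("facet", "true"),
        ("facet.field", field),                 # the parameter that was missing
        (f"f.{field}.facet.numTerms", "true"),  # per-field switch from the patch
        ("facet.mincount", "1"),
        ("facet.limit", "-1"),
    ]
    if shards:
        # Distributed request: comma-separated list of shard addresses.
        params.append(("shards", ",".join(shards)))
    return base_url + "?" + urlencode(params)


url = num_terms_url(
    "http://localhost:8983/solr/select",
    "price",
    shards=["localhost:8983/solr/", "localhost:8080/solr/"],
)
print(url)
```

urlencode percent-encodes characters such as ":" and "/" in the values, so the printed URL will look different from the hand-written one while carrying the same parameters.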


                
      was (Author: billnbell):
    uygar,

You are not using it properly. SOLR-2242-3x_5_tests.patch  does indeed work.

http://x.x.x.x:8985/solr/ar1/select?shards=192.168.200.202:8985/solr/ar3/,192.168.200.202:8985/solr/ar4&q=hotels&group=true&group.field=site&facet=true&f.site.facet.numFacetTerms=1&facet.mincount=1&facet.limit=-1

You forgot the facet.field=site

With sample data. Do the following.

Copy example to example2, and change jetty.xml on example2 to be port 8080.

Run this:

http://localhost:8983/solr/select?shards=localhost:8983/solr/,localhost:8080/solr/&q=*:*&rows=0&facet=true&facet.field=price&facet.numTerms=true&facet.mincount=1&facet.limit=-1



                  
> Get distinct count of names for a facet field
> ---------------------------------------------
>
>                 Key: SOLR-2242
>                 URL: https://issues.apache.org/jira/browse/SOLR-2242
>             Project: Solr
>          Issue Type: New Feature
>          Components: Response Writers
>    Affects Versions: 4.0-ALPHA
>            Reporter: Bill Bell
>            Priority: Minor
>             Fix For: 4.1
>
>         Attachments: SOLR-2242-3x_5_tests.patch, SOLR-2242-3x.patch, SOLR-2242.patch, SOLR-2242.patch, SOLR-2242.patch, SOLR-2242.shard.withtests.patch, SOLR-2242.solr3.1-fix.patch, SOLR-2242.solr3.1.patch, SOLR.2242.solr3.1.patch, SOLR-2242-solr40-3.patch
>
>
> When returning facet.field=<name of field> you will get a list of matches for distinct values. This is normal behavior. This patch tells you how many distinct values you have (# of rows). Use with limit=-1 and mincount=1.
> The feature is called "namedistinct". Here is an example:
> Parameters:
> facet.numTerms or f.<field>.facet.numTerms = true (default is false) - turn on distinct counting of terms
> facet.field - the field to count the terms
> It creates a new section in the facet section...
> http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numTerms=true&facet.limit=-1&facet.field=price
> http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numTerms=false&facet.limit=-1&facet.field=price
> http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numTerms=true&facet.limit=-1&facet.field=price
> This currently only works on facet.field.
> {code}
> <lst name="facet_counts">
> <lst name="facet_queries"/>
> <lst name="facet_fields">...</lst>
> <lst name="facet_numTerms">
> <lst name="localhost:8983/solr/">
> <int name="price">14</int>
> </lst>
> <lst name="localhost:8080/solr/">
> <int name="price">14</int>
> </lst>
> </lst>
> <lst name="facet_dates"/>
> <lst name="facet_ranges"/>
> </lst>
> OR with no sharding-
> <lst name="facet_numTerms">
> <int name="price">14</int>
> </lst>
> {code} 
> Several people use this to get the group.field count (the # of groups).



[jira] [Commented] (SOLR-2242) Get distinct count of names for a facet field

Posted by "Bill Bell (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13495020#comment-13495020 ] 

Bill Bell commented on SOLR-2242:
---------------------------------

uygar,

You are not using it properly. SOLR-2242-3x_5_tests.patch  does indeed work.

http://x.x.x.x:8985/solr/ar1/select?shards=192.168.200.202:8985/solr/ar3/,192.168.200.202:8985/solr/ar4&q=hotels&group=true&group.field=site&facet=true&f.site.facet.numFacetTerms=1&facet.mincount=1&facet.limit=-1

You forgot the facet.field=site

With the sample data, do the following:

Copy example to example2, and change jetty.xml on example2 to be port 8080.

Run this:

http://localhost:8983/solr/select?shards=localhost:8983/solr/,localhost:8080/solr/&q=*:*&rows=0&facet=true&facet.field=price&facet.numTerms=true&facet.mincount=1&facet.limit=-1



                
> Get distinct count of names for a facet field
> ---------------------------------------------
>
>                 Key: SOLR-2242
>                 URL: https://issues.apache.org/jira/browse/SOLR-2242
>             Project: Solr
>          Issue Type: New Feature
>          Components: Response Writers
>    Affects Versions: 4.0-ALPHA
>            Reporter: Bill Bell
>            Priority: Minor
>             Fix For: 4.1
>
>         Attachments: SOLR-2242-3x_5_tests.patch, SOLR-2242-3x.patch, SOLR-2242.patch, SOLR-2242.patch, SOLR-2242.patch, SOLR-2242.shard.withtests.patch, SOLR-2242.solr3.1-fix.patch, SOLR-2242.solr3.1.patch, SOLR.2242.solr3.1.patch, SOLR-2242-solr40-3.patch
>
>
> When returning facet.field=<name of field> you will get a list of matches for distinct values. This is normal behavior. This patch tells you how many distinct values you have (# of rows). Use with limit=-1 and mincount=1.
> The feature is called "namedistinct". Here is an example:
> http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=2&facet.limit=-1&facet.field=price
> http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=0&facet.limit=-1&facet.field=price
> http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=1&facet.limit=-1&facet.field=price
> This currently only works on facet.field.
> {code}
> <lst name="facet_fields">
>   <lst name="price">
>     <int name="numFacetTerms">14</int>
>     <int name="0.0">3</int><int name="11.5">1</int><int name="19.95">1</int><int name="74.99">1</int><int name="92.0">1</int><int name="179.99">1</int><int name="185.0">1</int><int name="279.95">1</int><int name="329.95">1</int><int name="350.0">1</int><int name="399.0">1</int><int name="479.95">1</int><int name="649.99">1</int><int name="2199.0">1</int>
>   </lst>
> </lst>
> {code} 
> Several people use this to get the group.field count (the # of groups).



[jira] [Commented] (SOLR-2242) Get distinct count of names for a facet field

Posted by "Bill Bell (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13240225#comment-13240225 ] 

Bill Bell commented on SOLR-2242:
---------------------------------

I changed the sharding response to check the size and only return the shard name if there is a response.

{code}
<lst name="facet_numTerms">
<lst name="localhost:8983/solr"/>
</lst>

Changed to 

<lst name="facet_numTerms"/>
{code}

Also, the code for field_facets was wrong. It needs to return the name of the field even if the size is 0 or null.

See latest patch for 3x.
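
The behaviour change described above can be pictured with a small Python sketch. It is not the patch code (which is Java); the function name and the dictionary shape are assumptions made for illustration.

```python
def merge_shard_num_terms(per_shard):
    """Keep only shards that actually returned numTerms data.

    per_shard maps shard name -> {field: distinct term count}; a shard
    with no data may map to an empty dict or None (hypothetical shape).
    """
    merged = {}
    for shard, counts in per_shard.items():
        if counts:  # skip empty or missing responses
            merged[shard] = dict(counts)
    return merged


# A shard with nothing to report no longer produces an empty entry.
print(merge_shard_num_terms({"localhost:8983/solr": {}}))
print(merge_shard_num_terms({
    "localhost:8983/solr": {"price": 14},
    "localhost:8080/solr": None,
}))
```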




                
> Get distinct count of names for a facet field
> ---------------------------------------------
>
>                 Key: SOLR-2242
>                 URL: https://issues.apache.org/jira/browse/SOLR-2242
>             Project: Solr
>          Issue Type: New Feature
>          Components: Response Writers
>    Affects Versions: 4.0
>            Reporter: Bill Bell
>            Priority: Minor
>             Fix For: 4.0
>
>         Attachments: SOLR-2242-3x.patch, SOLR-2242-solr40-3.patch, SOLR-2242.patch, SOLR-2242.patch, SOLR-2242.patch, SOLR-2242.shard.withtests.patch, SOLR-2242.solr3.1-fix.patch, SOLR-2242.solr3.1.patch, SOLR-2242.solr35.patch, SOLR.2242.solr3.1.patch
>
>
> When returning facet.field=<name of field> you will get a list of matches for distinct values. This is normal behavior. This patch tells you how many distinct values you have (# of rows). Use with limit=-1 and mincount=1.
> The feature is called "namedistinct". Here is an example:
> http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=2&facet.limit=-1&facet.field=price
> http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=0&facet.limit=-1&facet.field=price
> http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=1&facet.limit=-1&facet.field=price
> This currently only works on facet.field.
> {code}
> <lst name="facet_fields">
>   <lst name="price">
>     <int name="numFacetTerms">14</int>
>     <int name="0.0">3</int><int name="11.5">1</int><int name="19.95">1</int><int name="74.99">1</int><int name="92.0">1</int><int name="179.99">1</int><int name="185.0">1</int><int name="279.95">1</int><int name="329.95">1</int><int name="350.0">1</int><int name="399.0">1</int><int name="479.95">1</int><int name="649.99">1</int><int name="2199.0">1</int>
>   </lst>
> </lst>
> {code} 
> Several people use this to get the group.field count (the # of groups).



[jira] [Issue Comment Edited] (SOLR-2242) Get distinct count of names for a facet field

Posted by "Ethan Gruber (Issue Comment Edited) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13191185#comment-13191185 ] 

Ethan Gruber edited comment on SOLR-2242 at 1/23/12 2:50 PM:
-------------------------------------------------------------

+1 for me too.  I have been using this feature for almost a year.  I plan to upgrade to the newest patch/Solr trunk code, but the patch doesn't apply to the current trunk.  Do I have to check out the revision that dates to 12/21/11 to get this to work?

edit: never mind, the answer is yes.  I had to check out revision 1221500 from Dec. 20.
                
      was (Author: ewg118):
    +1 for me too.  I have been using this feature for almost a year.  I plan to upgrade to the newest patch/Solr trunk code, but the patch doesn't apply to the current trunk.  Do I have to check out the revision that dates to 12/21/11 to get this to work?
                  
> Get distinct count of names for a facet field
> ---------------------------------------------
>
>                 Key: SOLR-2242
>                 URL: https://issues.apache.org/jira/browse/SOLR-2242
>             Project: Solr
>          Issue Type: New Feature
>          Components: Response Writers
>    Affects Versions: 4.0
>            Reporter: Bill Bell
>            Assignee: Erick Erickson
>            Priority: Minor
>             Fix For: 4.0
>
>         Attachments: NumFacetTermsFacetsTest.java, SOLR-2242-notworkingtest.patch, SOLR-2242.patch, SOLR-2242.patch, SOLR-2242.patch, SOLR-2242.shard.patch, SOLR-2242.shard.patch, SOLR-2242.shard.withtests.patch, SOLR-2242.solr3.1-fix.patch, SOLR-2242.solr3.1.patch, SOLR.2242.solr3.1.patch, SOLR.2242.v2.patch
>
>
> When returning facet.field=<name of field> you will get a list of matches for distinct values. This is normal behavior. This patch tells you how many distinct values you have (# of rows). Use with limit=-1 and mincount=1.
> The feature is called "namedistinct". Here is an example:
> http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=2&facet.limit=-1&facet.field=price
> http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=0&facet.limit=-1&facet.field=price
> http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=1&facet.limit=-1&facet.field=price
> This currently only works on facet.field.
> {code}
> <lst name="facet_fields">
>   <lst name="price">
>     <int name="numFacetTerms">14</int>
>     <int name="0.0">3</int><int name="11.5">1</int><int name="19.95">1</int><int name="74.99">1</int><int name="92.0">1</int><int name="179.99">1</int><int name="185.0">1</int><int name="279.95">1</int><int name="329.95">1</int><int name="350.0">1</int><int name="399.0">1</int><int name="479.95">1</int><int name="649.99">1</int><int name="2199.0">1</int>
>   </lst>
> </lst>
> {code} 
> Several people use this to get the group.field count (the # of groups).



[jira] Commented: (SOLR-2242) Get distinct count of names for a facet field

Posted by "Bill Bell (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13006767#comment-13006767 ] 

Bill Bell commented on SOLR-2242:
---------------------------------

OK. So you like the word "constraints" instead of "namedistinct". I am okay with it.

I am going to work on this tonight.



> Get distinct count of names for a facet field
> ---------------------------------------------
>
>                 Key: SOLR-2242
>                 URL: https://issues.apache.org/jira/browse/SOLR-2242
>             Project: Solr
>          Issue Type: New Feature
>          Components: Response Writers
>    Affects Versions: 4.0
>            Reporter: Bill Bell
>            Priority: Minor
>             Fix For: 4.0
>
>         Attachments: SOLR-2242-distinctFacet.patch
>
>
> When returning facet.field=<name of field> you will get a list of matches for distinct values. This is normal behavior. This patch tells you how many distinct values you have (# of rows). Use with limit=-1 and mincount=1.
> The feature is called "namedistinct". Here is an example:
> http://localhost:8983/solr/select?q=*:*&facet=true&facet.field=manu&facet.mincount=1&facet.limit=-1&f.manu.facet.namedistinct=0&facet.field=price&f.price.facet.namedistinct=1
> Here is an example on field "hgid" (without namedistinct):
> {code}
> - <lst name="facet_fields">
> - <lst name="hgid">
>   <int name="HGPY0000045FD36D4000A">1</int> 
>   <int name="HGPY00000FBC6690453A9">1</int> 
>   <int name="HGPY00001E44ED6C4FB3B">1</int> 
>   <int name="HGPY00001FA631034A1B8">1</int> 
>   <int name="HGPY00003317ABAC43B48">1</int> 
>   <int name="HGPY00003A17B2294CB5A">5</int> 
>   <int name="HGPY00003ADD2B3D48C39">1</int> 
>   </lst>
>   </lst>
> {code}
> With namedistinct (HGPY0000045FD36D4000A, HGPY00000FBC6690453A9, HGPY00001E44ED6C4FB3B, HGPY00001FA631034A1B8, HGPY00003317ABAC43B48, HGPY00003A17B2294CB5A, HGPY00003ADD2B3D48C39). This returns number of rows (7), not the number of values (11).
> {code}
> - <lst name="facet_fields">
> - <lst name="hgid">
>   <int name="_count_">7</int> 
>   </lst>
>   </lst>
> {code}
> This works actually really good to get total number of fields for a group.field=hgid. Enjoy!



[jira] [Commented] (SOLR-2242) Get distinct count of names for a facet field

Posted by "Chris Male (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13072805#comment-13072805 ] 

Chris Male commented on SOLR-2242:
----------------------------------

I don't think it's realistic to send back the whole list; it could be huge! Besides, in the situation where we are only doing counts we aren't going to store the list anywhere.  The distributed environment is never going to be perfect in this situation; Ryan's and my suggestion is to send the minimum and maximum number of constraints there could be.
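
As an illustration of the minimum/maximum idea (a sketch only, not code from any attached patch): if each shard reports just its own distinct-term count with mincount=1, the merged distinct count is at least the largest per-shard count (all other shards hold a subset of that shard's terms) and at most the sum of the per-shard counts (no term is shared between shards). The class and method names below are hypothetical.

```java
// Hypothetical helper; not part of Solr or of any SOLR-2242 patch.
final class DistinctCountBounds {

    /** Lower bound: every other shard's terms are a subset of the largest shard's terms. */
    static long lowerBound(long[] perShardDistinct) {
        long max = 0;
        for (long count : perShardDistinct) {
            max = Math.max(max, count);
        }
        return max;
    }

    /** Upper bound: no term occurs on more than one shard. */
    static long upperBound(long[] perShardDistinct) {
        long sum = 0;
        for (long count : perShardDistinct) {
            sum += count;
        }
        return sum;
    }

    public static void main(String[] args) {
        long[] counts = {14, 9, 11}; // made-up per-shard distinct counts
        System.out.println(lowerBound(counts) + " <= distinct <= " + upperBound(counts));
    }
}
```

The bounds only hold under the stated assumption that every shard counts each of its terms exactly once; with a mincount above 1 a term could fall below the threshold on every shard and still pass it after merging.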

> Get distinct count of names for a facet field
> ---------------------------------------------
>
>                 Key: SOLR-2242
>                 URL: https://issues.apache.org/jira/browse/SOLR-2242
>             Project: Solr
>          Issue Type: New Feature
>          Components: Response Writers
>    Affects Versions: 4.0
>            Reporter: Bill Bell
>            Assignee: Simon Willnauer
>            Priority: Minor
>             Fix For: 4.0
>
>         Attachments: NumFacetTermsFacetsTest.java, SOLR-2242-notworkingtest.patch, SOLR-2242.patch, SOLR-2242.patch, SOLR-2242.shard.patch, SOLR-2242.shard.patch, SOLR-2242.shard.withtests.patch, SOLR-2242.solr3.1.patch, SOLR.2242.solr3.1.patch, SOLR.2242.v2.patch
>
>
> When returning facet.field=<name of field> you will get a list of matches for distinct values. This is normal behavior. This patch tells you how many distinct values you have (# of rows). Use with limit=-1 and mincount=1.
> The feature is called "namedistinct". Here is an example:
> http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=2&facet.limit=-1&facet.field=price
> http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=0&facet.limit=-1&facet.field=price
> http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=1&facet.limit=-1&facet.field=price
> This currently only works on facet.field.
> {code}
> <lst name="facet_fields">
>   <lst name="price">
>     <int name="numFacetTerms">14</int>
>     <int name="0.0">3</int><int name="11.5">1</int><int name="19.95">1</int><int name="74.99">1</int><int name="92.0">1</int><int name="179.99">1</int><int name="185.0">1</int><int name="279.95">1</int><int name="329.95">1</int><int name="350.0">1</int><int name="399.0">1</int><int name="479.95">1</int><int name="649.99">1</int><int name="2199.0">1</int>
>   </lst>
> </lst>
> {code} 
> Several people use this to get the group.field count (the # of groups).



[jira] Commented: (SOLR-2242) Get distinct count of names for a facet field

Posted by "Bill Bell (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13006808#comment-13006808 ] 

Bill Bell commented on SOLR-2242:
---------------------------------

Maybe, but I thought all params were supposed to be lower case?

I can easily change that ??

> Get distinct count of names for a facet field
> ---------------------------------------------
>
>                 Key: SOLR-2242
>                 URL: https://issues.apache.org/jira/browse/SOLR-2242
>             Project: Solr
>          Issue Type: New Feature
>          Components: Response Writers
>    Affects Versions: 4.0
>            Reporter: Bill Bell
>            Priority: Minor
>             Fix For: 4.0
>
>         Attachments: SOLR-2242-distinctFacet.patch, SOLR-2242.v2.patch
>
>
> When returning facet.field=<name of field> you will get a list of matches for distinct values. This is normal behavior. This patch tells you how many distinct values you have (# of rows). Use with limit=-1 and mincount=1.
> The feature is called "namedistinct". Here is an example:
> http://localhost:8983/solr/select?q=*:*&facet=true&facet.field=manu&facet.mincount=1&facet.limit=-1&f.manu.facet.namedistinct=0&facet.field=price&f.price.facet.namedistinct=1
> Here is an example on field "hgid" (without namedistinct):
> {code}
> - <lst name="facet_fields">
> - <lst name="hgid">
>   <int name="HGPY0000045FD36D4000A">1</int> 
>   <int name="HGPY00000FBC6690453A9">1</int> 
>   <int name="HGPY00001E44ED6C4FB3B">1</int> 
>   <int name="HGPY00001FA631034A1B8">1</int> 
>   <int name="HGPY00003317ABAC43B48">1</int> 
>   <int name="HGPY00003A17B2294CB5A">5</int> 
>   <int name="HGPY00003ADD2B3D48C39">1</int> 
>   </lst>
>   </lst>
> {code}
> With namedistinct (HGPY0000045FD36D4000A, HGPY00000FBC6690453A9, HGPY00001E44ED6C4FB3B, HGPY00001FA631034A1B8, HGPY00003317ABAC43B48, HGPY00003A17B2294CB5A, HGPY00003ADD2B3D48C39). This returns number of rows (7), not the number of values (11).
> {code}
> - <lst name="facet_fields">
> - <lst name="hgid">
>   <int name="_count_">7</int> 
>   </lst>
>   </lst>
> {code}
> This works actually really good to get total number of fields for a group.field=hgid. Enjoy!



[jira] [Commented] (SOLR-2242) Get distinct count of names for a facet field

Posted by "Lance Norskog (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13048474#comment-13048474 ] 

Lance Norskog commented on SOLR-2242:
-------------------------------------

If I were a committer, which I'm not, I would demand:
* params would be as simple as possible. 'namedistinct' would be a symbol like 'facet.method=enum'. Facets have exploded in complexity, and I can't follow how everything interlocks. The API may have to change later.
* no white-space glitches
* consistencyConsistencyConsistency.
* there has to be a way to use less memory when we're only pulling a count.
* unit tests. It's somewhat unfair to expect you to write all the unit tests required to make sure this does not break anything else, given that so many of the facet features do not have tests.

Anyway, food calls. Hope this helps.

> Get distinct count of names for a facet field
> ---------------------------------------------
>
>                 Key: SOLR-2242
>                 URL: https://issues.apache.org/jira/browse/SOLR-2242
>             Project: Solr
>          Issue Type: New Feature
>          Components: Response Writers
>    Affects Versions: 4.0
>            Reporter: Bill Bell
>            Priority: Minor
>             Fix For: 4.0
>
>         Attachments: SOLR-2242.patch, SOLR-2242.shard.patch, SOLR-2242.solr3.1.patch, SOLR.2242.solr3.1.patch, SOLR.2242.v2.patch
>
>
> When returning facet.field=<name of field> you will get a list of matches for distinct values. This is normal behavior. This patch tells you how many distinct values you have (# of rows). Use with limit=-1 and mincount=1.
> The feature is called "namedistinct". Here is an example:
> http://localhost:8983/solr/select?q=*:*&facet=true&facet.field=manu&facet.mincount=1&facet.limit=-1&f.manu.facet.namedistinct=0&facet.field=price&f.price.facet.namedistinct=1
> Here is an example on field "hgid" (without namedistinct):
> {code}
> - <lst name="facet_fields">
> - <lst name="hgid">
>   <int name="HGPY0000045FD36D4000A">1</int> 
>   <int name="HGPY00000FBC6690453A9">1</int> 
>   <int name="HGPY00001E44ED6C4FB3B">1</int> 
>   <int name="HGPY00001FA631034A1B8">1</int> 
>   <int name="HGPY00003317ABAC43B48">1</int> 
>   <int name="HGPY00003A17B2294CB5A">5</int> 
>   <int name="HGPY00003ADD2B3D48C39">1</int> 
>   </lst>
>   </lst>
> {code}
> With namedistinct (HGPY0000045FD36D4000A, HGPY00000FBC6690453A9, HGPY00001E44ED6C4FB3B, HGPY00001FA631034A1B8, HGPY00003317ABAC43B48, HGPY00003A17B2294CB5A, HGPY00003ADD2B3D48C39). This returns number of rows (7), not the number of values (11).
> {code}
> - <lst name="facet_fields">
> - <lst name="hgid">
>   <int name="_count_">7</int> 
>   </lst>
>   </lst>
> {code}
> This works actually really good to get total number of fields for a group.field=hgid. Enjoy!



[jira] [Commented] (SOLR-2242) Get distinct count of names for a facet field

Posted by "Lance Norskog (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13026863#comment-13026863 ] 

Lance Norskog commented on SOLR-2242:
-------------------------------------

bq. There is a lot of logic in getListedTermCounts() and getTermCountsLimit(). If we optimize, and just add a counter, we need to make sure the new methods are not forgotten about (test cases?). I have seen that happen numerous times.
Ayup. In fact this breaks SimpleFacetsTest. Everything in facets needs tests.
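
For illustration, "just add a counter" could look like the sketch below: it walks an array of per-term counts and increments a counter for each term that meets mincount, without building the list of names. This is an assumption-laden toy, not the logic of getListedTermCounts() or getTermCountsLimit(); the class name and the int[] input are hypothetical. The sample data is the hgid example from the issue description (seven names with counts 1,1,1,1,1,5,1).

```java
// Hypothetical sketch; does not use or reproduce any Solr internals.
final class CountOnlyFacet {

    /** Returns how many terms have a count of at least mincount, without collecting the terms. */
    static int countTermsAtLeast(int[] termCounts, int mincount) {
        int distinct = 0;
        for (int count : termCounts) {
            if (count >= mincount) {
                distinct++;
            }
        }
        return distinct;
    }

    public static void main(String[] args) {
        int[] hgidCounts = {1, 1, 1, 1, 1, 5, 1}; // counts from the hgid example in the description
        System.out.println(countTermsAtLeast(hgidCounts, 1)); // number of rows, not the sum of values
    }
}
```

A test along these lines (same input, asserted count) is the kind of guard the comment above asks for, so that a counter-only code path cannot silently drift from the list-building one.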

> Get distinct count of names for a facet field
> ---------------------------------------------
>
>                 Key: SOLR-2242
>                 URL: https://issues.apache.org/jira/browse/SOLR-2242
>             Project: Solr
>          Issue Type: New Feature
>          Components: Response Writers
>    Affects Versions: 4.0
>            Reporter: Bill Bell
>            Priority: Minor
>             Fix For: 4.0
>
>         Attachments: SOLR-2242.patch, SOLR-2242.solr3.1.patch, SOLR.2242.solr3.1.patch, SOLR.2242.v2.patch
>
>
> When returning facet.field=<name of field> you will get a list of matches for distinct values. This is normal behavior. This patch tells you how many distinct values you have (# of rows). Use with limit=-1 and mincount=1.
> The feature is called "namedistinct". Here is an example:
> http://localhost:8983/solr/select?q=*:*&facet=true&facet.field=manu&facet.mincount=1&facet.limit=-1&f.manu.facet.namedistinct=0&facet.field=price&f.price.facet.namedistinct=1
> Here is an example on field "hgid" (without namedistinct):
> {code}
> - <lst name="facet_fields">
> - <lst name="hgid">
>   <int name="HGPY0000045FD36D4000A">1</int> 
>   <int name="HGPY00000FBC6690453A9">1</int> 
>   <int name="HGPY00001E44ED6C4FB3B">1</int> 
>   <int name="HGPY00001FA631034A1B8">1</int> 
>   <int name="HGPY00003317ABAC43B48">1</int> 
>   <int name="HGPY00003A17B2294CB5A">5</int> 
>   <int name="HGPY00003ADD2B3D48C39">1</int> 
>   </lst>
>   </lst>
> {code}
> With namedistinct (HGPY0000045FD36D4000A, HGPY00000FBC6690453A9, HGPY00001E44ED6C4FB3B, HGPY00001FA631034A1B8, HGPY00003317ABAC43B48, HGPY00003A17B2294CB5A, HGPY00003ADD2B3D48C39). This returns number of rows (7), not the number of values (11).
> {code}
> - <lst name="facet_fields">
> - <lst name="hgid">
>   <int name="_count_">7</int> 
>   </lst>
>   </lst>
> {code}
> This works actually really good to get total number of fields for a group.field=hgid. Enjoy!



[jira] [Commented] (SOLR-2242) Get distinct count of names for a facet field

Posted by "Bill Bell (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13235366#comment-13235366 ] 

Bill Bell commented on SOLR-2242:
---------------------------------

https://issues.apache.org/jira/secure/attachment/12519406/SOLR-2242-solr40-2.patch is the latest patch.

                
> Get distinct count of names for a facet field
> ---------------------------------------------
>
>                 Key: SOLR-2242
>                 URL: https://issues.apache.org/jira/browse/SOLR-2242
>             Project: Solr
>          Issue Type: New Feature
>          Components: Response Writers
>    Affects Versions: 4.0
>            Reporter: Bill Bell
>            Priority: Minor
>             Fix For: 4.0
>
>         Attachments: NumFacetTermsFacetsTest.java, SOLR-2242-notworkingtest.patch, SOLR-2242-solr40-2.patch, SOLR-2242-solr40.patch, SOLR-2242.patch, SOLR-2242.patch, SOLR-2242.patch, SOLR-2242.shard.patch, SOLR-2242.shard.patch, SOLR-2242.shard.withtests.patch, SOLR-2242.solr3.1-fix.patch, SOLR-2242.solr3.1.patch, SOLR.2242.solr3.1.patch, SOLR.2242.v2.patch
>
>
> When returning facet.field=<name of field> you will get a list of matches for distinct values. This is normal behavior. This patch tells you how many distinct values you have (# of rows). Use with limit=-1 and mincount=1.
> The feature is called "namedistinct". Here is an example:
> http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=2&facet.limit=-1&facet.field=price
> http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=0&facet.limit=-1&facet.field=price
> http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=1&facet.limit=-1&facet.field=price
> This currently only works on facet.field.
> {code}
> <lst name="facet_fields">
>   <lst name="price">
>     <int name="numFacetTerms">14</int>
>     <int name="0.0">3</int><int name="11.5">1</int><int name="19.95">1</int><int name="74.99">1</int><int name="92.0">1</int><int name="179.99">1</int><int name="185.0">1</int><int name="279.95">1</int><int name="329.95">1</int><int name="350.0">1</int><int name="399.0">1</int><int name="479.95">1</int><int name="649.99">1</int><int name="2199.0">1</int>
>   </lst>
> </lst>
> {code} 
> Several people use this to get the group.field count (the # of groups).



[jira] Commented: (SOLR-2242) Get distinct count of names for a facet field

Posted by "Bill Bell (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13006774#comment-13006774 ] 

Bill Bell commented on SOLR-2242:
---------------------------------

Btw,

I hope constraints means unique names. It is different than number of
constraints. There might be a need for number of constraints, but that is
not what this ticket is for.

So, I think I am going to reject your proposed naming for mine:

{code}
Proposed:
"facet fields" : {"hgid" : {
  "missing" : 25,
  "namedistinct" : 25,
  "constraints": 1250,
  "counts" : ["constraint",10,...]
}}
{code}


Those are 2 different things.

> Get distinct count of names for a facet field
> ---------------------------------------------
>
>                 Key: SOLR-2242
>                 URL: https://issues.apache.org/jira/browse/SOLR-2242
>             Project: Solr
>          Issue Type: New Feature
>          Components: Response Writers
>    Affects Versions: 4.0
>            Reporter: Bill Bell
>            Priority: Minor
>             Fix For: 4.0
>
>         Attachments: SOLR-2242-distinctFacet.patch
>
>
> When returning facet.field=<name of field> you will get a list of matches for distinct values. This is normal behavior. This patch tells you how many distinct values you have (# of rows). Use with limit=-1 and mincount=1.
> The feature is called "namedistinct". Here is an example:
> http://localhost:8983/solr/select?q=*:*&facet=true&facet.field=manu&facet.mincount=1&facet.limit=-1&f.manu.facet.namedistinct=0&facet.field=price&f.price.facet.namedistinct=1
> Here is an example on field "hgid" (without namedistinct):
> {code}
> - <lst name="facet_fields">
> - <lst name="hgid">
>   <int name="HGPY0000045FD36D4000A">1</int> 
>   <int name="HGPY00000FBC6690453A9">1</int> 
>   <int name="HGPY00001E44ED6C4FB3B">1</int> 
>   <int name="HGPY00001FA631034A1B8">1</int> 
>   <int name="HGPY00003317ABAC43B48">1</int> 
>   <int name="HGPY00003A17B2294CB5A">5</int> 
>   <int name="HGPY00003ADD2B3D48C39">1</int> 
>   </lst>
>   </lst>
> {code}
> With namedistinct (HGPY0000045FD36D4000A, HGPY00000FBC6690453A9, HGPY00001E44ED6C4FB3B, HGPY00001FA631034A1B8, HGPY00003317ABAC43B48, HGPY00003A17B2294CB5A, HGPY00003ADD2B3D48C39). This returns number of rows (7), not the number of values (11).
> {code}
> - <lst name="facet_fields">
> - <lst name="hgid">
>   <int name="_count_">7</int> 
>   </lst>
>   </lst>
> {code}
> This works actually really good to get total number of fields for a group.field=hgid. Enjoy!



[jira] [Commented] (SOLR-2242) Get distinct count of names for a facet field

Posted by "Bill Bell (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13040096#comment-13040096 ] 

Bill Bell commented on SOLR-2242:
---------------------------------

From Rajani:


The SOLR-2242 patch for getting the count of distinct facet terms doesn't
work for distributedProcess

(https://issues.apache.org/jira/browse/SOLR-2242)

The error log says

HTTP ERROR 500
Problem accessing /solr/select. Reason:

    For input string: "numFacetTerms"

java.lang.NumberFormatException: For input string: "numFacetTerms"
at java.lang.NumberFormatException.forInputString(NumberFormatException.java:48)
at java.lang.Long.parseLong(Long.java:403)
at java.lang.Long.parseLong(Long.java:461)
at org.apache.solr.schema.TrieField.readableToIndexed(TrieField.java:331)
at org.apache.solr.schema.TrieField.toInternal(TrieField.java:344)
at org.apache.solr.handler.component.FacetComponent$DistribFieldFacet.add(FacetComponent.java:619)
at org.apache.solr.handler.component.FacetComponent.countFacets(FacetComponent.java:265)
at org.apache.solr.handler.component.FacetComponent.handleResponses(FacetComponent.java:235)
at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:290)
at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
at org.apache.solr.core.SolrCore.execute(SolrCore.java:1316)
at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:338)
at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:241)
at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:399)
at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182)
at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:766)
at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:450)
at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)
at org.mortbay.jetty.Server.handle(Server.java:326)
at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:542)
at org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:928)
at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:549)
at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:212)
at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:404)
at org.mortbay.io.nio.SelectChannelEndPoint.run(SelectChannelEndPoint.java:410)
at org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:582)


The query I passed :
http://localhost:8983/solr/select?q=*:*&facet=true&facet.field=2&facet.field=648&facet.mincount=1&facet.limit=-1&f.2.facet.numFacetTerms=1&rows=0&shards=localhost:8983/solr,localhost:8985/solrtwo

Can anyone suggest the changes I need to make to enable the same
functionality for shards?

When I run it against a single core, I get the correct results. I have applied
the SOLR-2242 patch to Solr 1.4.1.

Awaiting your reply.

Regards,
Rajani
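
One plausible reading of the trace above (an assumption, not a confirmed diagnosis) is that the coordinating node treats every entry in a shard's facet list as a term, so the extra "numFacetTerms" entry reaches TrieField.toInternal and fails Long.parseLong on a numeric field. The sketch below shows the general shape of a merge that skips such a reserved entry before parsing terms. It is standalone illustration code: the class, method, and map-based shard responses are hypothetical and do not reproduce FacetComponent.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch; not the FacetComponent merge logic.
final class ShardFacetMerge {

    /** Name of the extra count entry, as it appears in the error message above. */
    static final String RESERVED_KEY = "numFacetTerms";

    /** Merges per-shard term counts for a numeric field, skipping the reserved count entry. */
    static Map<Long, Long> mergeNumericTerms(List<Map<String, Long>> shardResponses) {
        Map<Long, Long> merged = new TreeMap<>();
        for (Map<String, Long> shard : shardResponses) {
            for (Map.Entry<String, Long> entry : shard.entrySet()) {
                if (RESERVED_KEY.equals(entry.getKey())) {
                    continue; // not a term; parsing it as a number would throw NumberFormatException
                }
                long term = Long.parseLong(entry.getKey());
                merged.merge(term, entry.getValue(), Long::sum);
            }
        }
        return merged;
    }

    public static void main(String[] args) {
        Map<Long, Long> merged = mergeNumericTerms(List.of(
                Map.of("numFacetTerms", 2L, "10", 1L, "20", 3L),
                Map.of("numFacetTerms", 2L, "20", 1L, "30", 4L)));
        // The distinct count is taken from the merged terms, not summed from the shards.
        System.out.println("distinct terms: " + merged.size());
    }
}
```

In this toy setup the merged distinct count comes from the size of the merged map, which requires every shard to return its full term list (facet.limit=-1).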


> Get distinct count of names for a facet field
> ---------------------------------------------
>
>                 Key: SOLR-2242
>                 URL: https://issues.apache.org/jira/browse/SOLR-2242
>             Project: Solr
>          Issue Type: New Feature
>          Components: Response Writers
>    Affects Versions: 4.0
>            Reporter: Bill Bell
>            Priority: Minor
>             Fix For: 4.0
>
>         Attachments: SOLR-2242.patch, SOLR-2242.solr3.1.patch, SOLR.2242.solr3.1.patch, SOLR.2242.v2.patch
>
>
> When returning facet.field=<name of field> you will get a list of matches for distinct values. This is normal behavior. This patch tells you how many distinct values you have (# of rows). Use with limit=-1 and mincount=1.
> The feature is called "namedistinct". Here is an example:
> http://localhost:8983/solr/select?q=*:*&facet=true&facet.field=manu&facet.mincount=1&facet.limit=-1&f.manu.facet.namedistinct=0&facet.field=price&f.price.facet.namedistinct=1
> Here is an example on field "hgid" (without namedistinct):
> {code}
> - <lst name="facet_fields">
> - <lst name="hgid">
>   <int name="HGPY0000045FD36D4000A">1</int> 
>   <int name="HGPY00000FBC6690453A9">1</int> 
>   <int name="HGPY00001E44ED6C4FB3B">1</int> 
>   <int name="HGPY00001FA631034A1B8">1</int> 
>   <int name="HGPY00003317ABAC43B48">1</int> 
>   <int name="HGPY00003A17B2294CB5A">5</int> 
>   <int name="HGPY00003ADD2B3D48C39">1</int> 
>   </lst>
>   </lst>
> {code}
> With namedistinct (HGPY0000045FD36D4000A, HGPY00000FBC6690453A9, HGPY00001E44ED6C4FB3B, HGPY00001FA631034A1B8, HGPY00003317ABAC43B48, HGPY00003A17B2294CB5A, HGPY00003ADD2B3D48C39). This returns number of rows (7), not the number of values (11).
> {code}
> - <lst name="facet_fields">
> - <lst name="hgid">
>   <int name="_count_">7</int> 
>   </lst>
>   </lst>
> {code}
> This works actually really good to get total number of fields for a group.field=hgid. Enjoy!



[jira] [Updated] (SOLR-2242) Get distinct count of names for a facet field

Posted by "Bill Bell (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bill Bell updated SOLR-2242:
----------------------------

    Attachment: SOLR-2242-solr40-2.patch

Added Sharding
                
> Get distinct count of names for a facet field
> ---------------------------------------------
>
>                 Key: SOLR-2242
>                 URL: https://issues.apache.org/jira/browse/SOLR-2242
>             Project: Solr
>          Issue Type: New Feature
>          Components: Response Writers
>    Affects Versions: 4.0
>            Reporter: Bill Bell
>            Priority: Minor
>             Fix For: 4.0
>
>         Attachments: NumFacetTermsFacetsTest.java, SOLR-2242-notworkingtest.patch, SOLR-2242-solr40-2.patch, SOLR-2242-solr40.patch, SOLR-2242.patch, SOLR-2242.patch, SOLR-2242.patch, SOLR-2242.shard.patch, SOLR-2242.shard.patch, SOLR-2242.shard.withtests.patch, SOLR-2242.solr3.1-fix.patch, SOLR-2242.solr3.1.patch, SOLR.2242.solr3.1.patch, SOLR.2242.v2.patch
>
>
> When returning facet.field=<name of field> you will get a list of matches for distinct values. This is normal behavior. This patch tells you how many distinct values you have (# of rows). Use with limit=-1 and mincount=1.
> The feature is called "namedistinct". Here is an example:
> http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=2&facet.limit=-1&facet.field=price
> http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=0&facet.limit=-1&facet.field=price
> http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=1&facet.limit=-1&facet.field=price
> This currently only works on facet.field.
> {code}
> <lst name="facet_fields">
>   <lst name="price">
>     <int name="numFacetTerms">14</int>
>     <int name="0.0">3</int><int name="11.5">1</int><int name="19.95">1</int><int name="74.99">1</int><int name="92.0">1</int><int name="179.99">1</int><int name="185.0">1</int><int name="279.95">1</int><int name="329.95">1</int><int name="350.0">1</int><int name="399.0">1</int><int name="479.95">1</int><int name="649.99">1</int><int name="2199.0">1</int>
>   </lst>
> </lst>
> {code} 
> Several people use this to get the group.field count (the # of groups).
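
For anyone without the patch applied, the same number can be computed on the client. This is only a sketch, assuming the flat response layout shown in the description above (the numFacetTerms entry sits next to the per-value entries) and a request made with facet.limit=-1 and facet.mincount=1. The function name count_distinct_facet_values and the shortened sample are illustrative, not part of Solr or of the patch.

```python
import xml.etree.ElementTree as ET

# Shortened, illustrative fragment in the same shape as the response above.
SAMPLE = """
<lst name="facet_fields">
  <lst name="price">
    <int name="numFacetTerms">3</int>
    <int name="0.0">3</int>
    <int name="11.5">1</int>
    <int name="19.95">1</int>
  </lst>
</lst>
""".strip()


def count_distinct_facet_values(xml_text, field):
    """Count the distinct values returned for one facet field.

    Assumes facet.limit=-1 and facet.mincount=1, so that every value with
    at least one match is present in the response.
    """
    root = ET.fromstring(xml_text)
    for lst in root.findall("lst"):
        if lst.get("name") == field:
            # Skip the summary entry added by the patch; count real values only.
            return sum(
                1 for entry in lst.findall("int")
                if entry.get("name") != "numFacetTerms"
            )
    raise KeyError(field)


print(count_distinct_facet_values(SAMPLE, "price"))
```

Counting on the client means shipping every facet value over the wire, which is the cost the server-side count is meant to avoid.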



[jira] [Commented] (SOLR-2242) Get distinct count of names for a facet field

Posted by "Lance Norskog (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13048471#comment-13048471 ] 

Lance Norskog commented on SOLR-2242:
-------------------------------------

Yeah, my itch started just now also :)

"Constraint" means any facet value: terms, numerical ranges, query results.

Range queries have the same situation: when I give range endpoints and a gap, I want to know how many intervals it made from the gap. That would be the analog of this count.

I'm not saying this patch has to do range counts also, but pointing out the eventual scope of this feature. Therefore, 'numTerms' is not the word we're looking for. 'count' or 'total' seem right.

Below, both *features:{* and *popularity:{* need counts. 
 
{code}
"facet_counts":{
    "facet_queries":{
      "*:*":27},
    "facet_fields":{
      "features":[
        "facet_terms",[
          "2",7]]},
    "facet_ranges":{
      "popularity":{
        "counts":[
          "0",3,
          "2",0,
          "4",1,
          "6",9],
        "gap":2,
        "start":0,
        "end":8}}}}
{code}


p.s.
I got the above from the example electronic shop database with this query:
[click to see|http://localhost:8983/solr/select/?q=*%3A*&version=2.2&start=0&rows=0&indent=on&facet.field=popularity&facet=true&facet.numTerms=true&facet.query=*:*&wt=json&facet.range.start=0&facet.range.end=7&facet.range.gap=2&facet.range=popularity]
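
The interval count for a range facet can be worked out from the request parameters alone. A minimal sketch, assuming integer endpoints and a positive integer gap; num_range_buckets is a hypothetical helper, not a Solr API.

```python
def num_range_buckets(start, end, gap):
    """Number of gap-sized intervals needed to cover [start, end).

    Integer inputs are assumed; the last interval may extend past `end`.
    """
    if gap <= 0:
        raise ValueError("gap must be positive")
    if end <= start:
        return 0
    # Ceiling division without going through floating point.
    return (end - start + gap - 1) // gap


# The query above asks for start=0, end=7, gap=2.
print(num_range_buckets(0, 7, 2))
```

For the query above this gives 4, which matches the four entries under "counts" (0, 2, 4, 6). The response reports "end":8 rather than the requested 7, which is consistent with the last interval being extended to a full gap.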

> Get distinct count of names for a facet field
> ---------------------------------------------
>
>                 Key: SOLR-2242
>                 URL: https://issues.apache.org/jira/browse/SOLR-2242
>             Project: Solr
>          Issue Type: New Feature
>          Components: Response Writers
>    Affects Versions: 4.0
>            Reporter: Bill Bell
>            Priority: Minor
>             Fix For: 4.0
>
>         Attachments: SOLR-2242.patch, SOLR-2242.shard.patch, SOLR-2242.solr3.1.patch, SOLR.2242.solr3.1.patch, SOLR.2242.v2.patch
>
>
> When returning facet.field=<name of field> you will get a list of matches for distinct values. This is normal behavior. This patch tells you how many distinct values you have (# of rows). Use with limit=-1 and mincount=1.
> The feature is called "namedistinct". Here is an example:
> http://localhost:8983/solr/select?q=*:*&facet=true&facet.field=manu&facet.mincount=1&facet.limit=-1&f.manu.facet.namedistinct=0&facet.field=price&f.price.facet.namedistinct=1
> Here is an example on field "hgid" (without namedistinct):
> {code}
> - <lst name="facet_fields">
> - <lst name="hgid">
>   <int name="HGPY0000045FD36D4000A">1</int> 
>   <int name="HGPY00000FBC6690453A9">1</int> 
>   <int name="HGPY00001E44ED6C4FB3B">1</int> 
>   <int name="HGPY00001FA631034A1B8">1</int> 
>   <int name="HGPY00003317ABAC43B48">1</int> 
>   <int name="HGPY00003A17B2294CB5A">5</int> 
>   <int name="HGPY00003ADD2B3D48C39">1</int> 
>   </lst>
>   </lst>
> {code}
> With namedistinct (HGPY0000045FD36D4000A, HGPY00000FBC6690453A9, HGPY00001E44ED6C4FB3B, HGPY00001FA631034A1B8, HGPY00003317ABAC43B48, HGPY00003A17B2294CB5A, HGPY00003ADD2B3D48C39). This returns number of rows (7), not the number of values (11).
> {code}
> - <lst name="facet_fields">
> - <lst name="hgid">
>   <int name="_count_">7</int> 
>   </lst>
>   </lst>
> {code}
> This actually works really well for getting the total number of fields for group.field=hgid. Enjoy!
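
The difference between the two numbers in the description (7 rows versus 11 values) is easy to check by hand. A small sketch using the hgid counts quoted above; the variable names are only for illustration.

```python
# Facet counts for field "hgid", copied from the example response above.
hgid_counts = {
    "HGPY0000045FD36D4000A": 1,
    "HGPY00000FBC6690453A9": 1,
    "HGPY00001E44ED6C4FB3B": 1,
    "HGPY00001FA631034A1B8": 1,
    "HGPY00003317ABAC43B48": 1,
    "HGPY00003A17B2294CB5A": 5,
    "HGPY00003ADD2B3D48C39": 1,
}

distinct_values = len(hgid_counts)          # number of rows in the facet list
total_matches = sum(hgid_counts.values())   # sum of the per-value counts

print(distinct_values, total_matches)
```

The first number (7) is what the patch reports; the second (11) is the sum of the per-value counts.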



[jira] [Updated] (SOLR-2242) Get distinct count of names for a facet field

Posted by "Bill Bell (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bill Bell updated SOLR-2242:
----------------------------

    Attachment:     (was: SOLR-2242-solr40-2.patch)
    
> Get distinct count of names for a facet field
> ---------------------------------------------
>
>                 Key: SOLR-2242
>                 URL: https://issues.apache.org/jira/browse/SOLR-2242
>             Project: Solr
>          Issue Type: New Feature
>          Components: Response Writers
>    Affects Versions: 4.0
>            Reporter: Bill Bell
>            Priority: Minor
>             Fix For: 4.0
>
>         Attachments: SOLR-2242-solr40-3.patch, SOLR-2242.patch, SOLR-2242.patch, SOLR-2242.patch, SOLR-2242.shard.patch, SOLR-2242.shard.withtests.patch, SOLR-2242.solr3.1-fix.patch, SOLR-2242.solr3.1.patch, SOLR-2242.solr35.patch, SOLR.2242.solr3.1.patch



[jira] [Commented] (SOLR-2242) Get distinct count of names for a facet field

Posted by "Bill Bell (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13048461#comment-13048461 ] 

Bill Bell commented on SOLR-2242:
---------------------------------

Can we PLEASE commit this?  What else do we need to add?

> Get distinct count of names for a facet field
> ---------------------------------------------
>
>                 Key: SOLR-2242
>                 URL: https://issues.apache.org/jira/browse/SOLR-2242
>             Project: Solr
>          Issue Type: New Feature
>          Components: Response Writers
>    Affects Versions: 4.0
>            Reporter: Bill Bell
>            Priority: Minor
>             Fix For: 4.0
>
>         Attachments: SOLR-2242.patch, SOLR-2242.shard.patch, SOLR-2242.solr3.1.patch, SOLR.2242.solr3.1.patch, SOLR.2242.v2.patch



[jira] [Commented] (SOLR-2242) Get distinct count of names for a facet field

Posted by "Antoine Le Floc'h (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13178718#comment-13178718 ] 

Antoine Le Floc'h commented on SOLR-2242:
-----------------------------------------


People who need backward compatibility won't be able to use {code} &facet.numTerms=true {code}. Isn't that fair?

About the distribution issue, maybe the distinct counter could be displayed per shard, something like:
{code}
<lst name="facet_fields">
  <lst name="shop_id">
    <lst name="numTerms"> 
      <int ip="192.168.0.100">58</int>
      <int ip="192.168.0.101">158</int>
    </lst>
    <lst name="counts">
      <int name="28013756">7032406</int>
      <int name="28009589">3616625</int>
      <int name="976">3497825</int>
      <int name="635">1398780</int>
      <int name="28021713">440118</int>
    </lst>
  </lst>
</lst>
{code}
This way, people who don't use shards are happy, and people who do can display whatever makes sense for them while waiting for something better. This would allow us to move forward with this JIRA.
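
One caveat with per-shard counters: they cannot simply be added up, because the same term may exist on several shards. A minimal sketch of the difference, with made-up term sets per shard; merged_distinct_count is a hypothetical helper, not something the patch provides.

```python
def merged_distinct_count(shard_terms):
    """Distinct terms across all shards: the size of the union, not the sum."""
    merged = set()
    for terms in shard_terms.values():
        merged.update(terms)
    return len(merged)


# Illustrative data only: term "976" exists on both shards.
shards = {
    "192.168.0.100": {"976", "635", "28013756"},
    "192.168.0.101": {"976", "28009589"},
}

per_shard = {ip: len(terms) for ip, terms in shards.items()}
naive_sum = sum(per_shard.values())
true_count = merged_distinct_count(shards)

print(per_shard, naive_sum, true_count)
```

Here the per-shard counts are 3 and 2, their sum is 5, but only 4 distinct terms exist overall. Computing the exact merged figure requires the terms themselves (or at least their identities) from every shard, which is what makes the distributed case harder.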

                
> Get distinct count of names for a facet field
> ---------------------------------------------
>
>                 Key: SOLR-2242
>                 URL: https://issues.apache.org/jira/browse/SOLR-2242
>             Project: Solr
>          Issue Type: New Feature
>          Components: Response Writers
>    Affects Versions: 4.0
>            Reporter: Bill Bell
>            Assignee: Erick Erickson
>            Priority: Minor
>             Fix For: 4.0
>
>         Attachments: NumFacetTermsFacetsTest.java, SOLR-2242-notworkingtest.patch, SOLR-2242.patch, SOLR-2242.patch, SOLR-2242.patch, SOLR-2242.shard.patch, SOLR-2242.shard.patch, SOLR-2242.shard.withtests.patch, SOLR-2242.solr3.1-fix.patch, SOLR-2242.solr3.1.patch, SOLR.2242.solr3.1.patch, SOLR.2242.v2.patch



[jira] [Commented] (SOLR-2242) Get distinct count of names for a facet field

Posted by "Bill Bell (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13058205#comment-13058205 ] 

Bill Bell commented on SOLR-2242:
---------------------------------

Thanks... If you look at my tests that I commented out, you will notice you get the Insane FieldCache usage(s) problem.

It does it every time on my PC...

This patch does not appear to have any issues until you pull in the group issue.

> Get distinct count of names for a facet field
> ---------------------------------------------
>
>                 Key: SOLR-2242
>                 URL: https://issues.apache.org/jira/browse/SOLR-2242
>             Project: Solr
>          Issue Type: New Feature
>          Components: Response Writers
>    Affects Versions: 4.0
>            Reporter: Bill Bell
>            Assignee: Simon Willnauer
>            Priority: Minor
>             Fix For: 4.0
>
>         Attachments: SOLR-2242-notworkingtest.patch, SOLR-2242.patch, SOLR-2242.patch, SOLR-2242.shard.patch, SOLR-2242.shard.patch, SOLR-2242.shard.withtests.patch, SOLR-2242.solr3.1.patch, SOLR.2242.solr3.1.patch, SOLR.2242.v2.patch



[jira] [Commented] (SOLR-2242) Get distinct count of names for a facet field

Posted by "Bill Bell (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13026103#comment-13026103 ] 

Bill Bell commented on SOLR-2242:
---------------------------------

Lance Norskog,

What do you want it to be called? I could use a committer to take this issue on. It has several votes and lots of downloads. People are already using it successfully.

Do you want me to rename numFacetTerms to numFacetNames? Anything else? I feel like we are going in circles on this issue.

{code}

This will output the numFacetTerms AND hgid:
http://localhost:8983/solr/select?q=*:*&facet=true&facet.field=hgid&facet.mincount=1&f.hgid.facet.numFacetTerms=2

<lst name="facet_fields">
  <lst name="hgid">
   <int name="numFacetTerms">7</int>  <!-- this is not 11 -->
   <lst name="counts">
   	<int name="HGPY0000045FD36D4000A">1</int>
   	<int name="HGPY00000FBC6690453A9">1</int>
   	<int name="HGPY00001E44ED6C4FB3B">1</int>
   	<int name="HGPY00001FA631034A1B8">1</int>
   	<int name="HGPY00003317ABAC43B48">1</int>
   	<int name="HGPY00003A17B2294CB5A">5</int>
   	<int name="HGPY00003ADD2B3D48C39">1</int>
   </lst>
  </lst>
</lst>

{code}

> Get distinct count of names for a facet field
> ---------------------------------------------
>
>                 Key: SOLR-2242
>                 URL: https://issues.apache.org/jira/browse/SOLR-2242
>             Project: Solr
>          Issue Type: New Feature
>          Components: Response Writers
>    Affects Versions: 4.0
>            Reporter: Bill Bell
>            Priority: Minor
>             Fix For: 4.0
>
>         Attachments: SOLR-2242.patch, SOLR.2242.solr3.1.patch, SOLR.2242.v2.patch



[jira] Issue Comment Edited: (SOLR-2242) Get distinct count of names for a facet field

Posted by "Bill Bell (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13006792#comment-13006792 ] 

Bill Bell edited comment on SOLR-2242 at 3/15/11 5:45 AM:
----------------------------------------------------------

I am going to use your suggestion. You will not have to set the limit. Getting the numFacetTerms will be optional, and you also will be able to NOT get the hgids as well. I propose this (please comment):

This will ONLY output the numFacetTerms (no hgid facet counts):
http://localhost:8983/solr/select?q=*:*&facet=true&facet.field=hgid&f.hgid.facet.numfacetterms=1

This assumes the count will be limit=-1

{code}
<lst name="facet_fields">
  <lst name="hgid">
   <int name="numFacetTerms">7</int>  <!-- this is not 11 -->
  </lst>
</lst>
{code}

This will output the numFacetTerms AND hgid:
http://localhost:8983/solr/select?q=*:*&facet=true&facet.field=hgid&facet.mincount=1&f.hgid.facet.numfacetterms=2

{code}
<lst name="facet_fields">
  <lst name="hgid">
   <int name="numFacetTerms">7</int>  <!-- this is not 11 -->
   <lst name="counts">
   	<int name="HGPY0000045FD36D4000A">1</int>
   	<int name="HGPY00000FBC6690453A9">1</int>
   	<int name="HGPY00001E44ED6C4FB3B">1</int>
   	<int name="HGPY00001FA631034A1B8">1</int>
   	<int name="HGPY00003317ABAC43B48">1</int>
   	<int name="HGPY00003A17B2294CB5A">5</int>
   	<int name="HGPY00003ADD2B3D48C39">1</int>
   </lst>
  </lst>
</lst>
{code}

      was (Author: billnbell):
    I am going to use your suggestion. You will not have to set the limit or mincount. Getting the numFacetTerms will be optional, and you also will be able to NOT get the hgids as well. I propose this (please comment):

This will ONLY output the numFacetTerms (no hgid facet counts):
http://localhost:8983/solr/select?q=*:*&facet=true&facet.field=hgid&f.hgid.facet.numfacetterms=1

This assumes the count will be mincount=1, and limit=-1

{code}
<lst name="facet_fields">
  <lst name="hgid">
   <int name="numFacetTerms">7</int>  <!-- this is not 11 -->
  </lst>
</lst>
{code}

This will output the numFacetTerms AND hgid:
http://localhost:8983/solr/select?q=*:*&facet=true&facet.field=hgid&facet.mincount=1&f.hgid.facet.numfacetterms=2

{code}
<lst name="facet_fields">
  <lst name="hgid">
   <int name="numFacetTerms">7</int>  <!-- this is not 11 -->
   <lst name="counts">
   	<int name="HGPY0000045FD36D4000A">1</int>
   	<int name="HGPY00000FBC6690453A9">1</int>
   	<int name="HGPY00001E44ED6C4FB3B">1</int>
   	<int name="HGPY00001FA631034A1B8">1</int>
   	<int name="HGPY00003317ABAC43B48">1</int>
   	<int name="HGPY00003A17B2294CB5A">5</int>
   	<int name="HGPY00003ADD2B3D48C39">1</int>
   </lst>
  </lst>
</lst>
{code}
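
The two request forms proposed in this comment can be generated programmatically. A sketch only: the parameter name f.<field>.facet.numfacetterms and the values 1 (count only) and 2 (count plus facet values) are taken from the comment above, while numfacetterms_url and BASE are hypothetical names, and the parameter exists only with the patch applied.

```python
from urllib.parse import urlencode

BASE = "http://localhost:8983/solr/select"


def numfacetterms_url(field, mode, mincount=None):
    """Build a select URL using the per-field numfacetterms parameter.

    mode=1: only the distinct count; mode=2: count plus facet values
    (as proposed in the comment; other values are not described there).
    """
    if mode not in (1, 2):
        raise ValueError("only modes 1 and 2 are described in the proposal")
    params = [("q", "*:*"), ("facet", "true"), ("facet.field", field)]
    if mincount is not None:
        params.append(("facet.mincount", str(mincount)))
    params.append((f"f.{field}.facet.numfacetterms", str(mode)))
    # Keep '*' and ':' readable so the URL matches the hand-written ones.
    return BASE + "?" + urlencode(params, safe="*:")


print(numfacetterms_url("hgid", 1))
print(numfacetterms_url("hgid", 2, mincount=1))
```

The two printed URLs reproduce the ones given in the comment.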
  
> Get distinct count of names for a facet field
> ---------------------------------------------
>
>                 Key: SOLR-2242
>                 URL: https://issues.apache.org/jira/browse/SOLR-2242
>             Project: Solr
>          Issue Type: New Feature
>          Components: Response Writers
>    Affects Versions: 4.0
>            Reporter: Bill Bell
>            Priority: Minor
>             Fix For: 4.0
>
>         Attachments: SOLR-2242-distinctFacet.patch



[jira] [Commented] (SOLR-2242) Get distinct count of names for a facet field

Posted by "Simon Willnauer (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13061839#comment-13061839 ] 

Simon Willnauer commented on SOLR-2242:
---------------------------------------

bq. Are we ready to commit?
Bill, isn't there still a test failure on this issue related to FC? Yonik mentioned BW compat issues here and promised to comment. I will ping him again.

thanks for the patience

simon

> Get distinct count of names for a facet field
> ---------------------------------------------
>
>                 Key: SOLR-2242
>                 URL: https://issues.apache.org/jira/browse/SOLR-2242
>             Project: Solr
>          Issue Type: New Feature
>          Components: Response Writers
>    Affects Versions: 4.0
>            Reporter: Bill Bell
>            Assignee: Simon Willnauer
>            Priority: Minor
>             Fix For: 4.0
>
>         Attachments: SOLR-2242-notworkingtest.patch, SOLR-2242.patch, SOLR-2242.patch, SOLR-2242.shard.patch, SOLR-2242.shard.patch, SOLR-2242.shard.withtests.patch, SOLR-2242.solr3.1.patch, SOLR.2242.solr3.1.patch, SOLR.2242.v2.patch



[jira] [Updated] (SOLR-2242) Get distinct count of names for a facet field

Posted by "James Dyer (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

James Dyer updated SOLR-2242:
-----------------------------

    Attachment: SOLR-2242.patch

I noticed that with the original patch applied, SimpleFacetsTest would fail.  The reason is a tiny bug that affects backwards-compatibility in that this would wrap the counts with a "counts" element in the response.  This is valid if using the "namedistinct" param, but if a user doesn't specify this, it shouldn't affect old behavior.  This updated patch corrects this little issue and SimpleFacetsTest now passes. 
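
The backwards-compatibility rule described here can be illustrated outside of Solr. This is not the patch's Java code, only a sketch of the behaviour: the "counts" wrapper appears only when the distinct count was requested, and the legacy layout is untouched otherwise. build_field_facet is a hypothetical name.

```python
def build_field_facet(counts, num_terms_requested):
    """Return the response section for one facet field.

    counts: mapping of facet value -> number of matching documents.
    """
    if not num_terms_requested:
        # Legacy layout: values sit directly under the field name,
        # so existing clients keep working unchanged.
        return dict(counts)
    # New layout: summary entry plus the values wrapped in "counts".
    return {"numFacetTerms": len(counts), "counts": dict(counts)}


sample = {"0.0": 3, "11.5": 1, "19.95": 1}
print(build_field_facet(sample, False))
print(build_field_facet(sample, True))
```

Keeping the old shape as the default is what lets a test such as SimpleFacetsTest, which expects the unwrapped layout, continue to pass.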

> Get distinct count of names for a facet field
> ---------------------------------------------
>
>                 Key: SOLR-2242
>                 URL: https://issues.apache.org/jira/browse/SOLR-2242
>             Project: Solr
>          Issue Type: New Feature
>          Components: Response Writers
>    Affects Versions: 4.0
>            Reporter: Bill Bell
>            Priority: Minor
>             Fix For: 4.0
>
>         Attachments: SOLR-2242.patch, SOLR.2242.solr3.1.patch, SOLR.2242.v2.patch



[jira] [Commented] (SOLR-2242) Get distinct count of names for a facet field

Posted by "Bill Bell (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13026827#comment-13026827 ] 

Bill Bell commented on SOLR-2242:
---------------------------------

Also I thought you wanted to change the name to numNames? I am okay with numTerms too.

> Get distinct count of names for a facet field
> ---------------------------------------------
>
>                 Key: SOLR-2242
>                 URL: https://issues.apache.org/jira/browse/SOLR-2242
>             Project: Solr
>          Issue Type: New Feature
>          Components: Response Writers
>    Affects Versions: 4.0
>            Reporter: Bill Bell
>            Priority: Minor
>             Fix For: 4.0
>
>         Attachments: SOLR-2242.patch, SOLR-2242.solr3.1.patch, SOLR.2242.solr3.1.patch, SOLR.2242.v2.patch
>
>
> When returning facet.field=<name of field> you will get a list of matches for distinct values. This is normal behavior. This patch tells you how many distinct values you have (# of rows). Use with limit=-1 and mincount=1.
> The feature is called "namedistinct". Here is an example:
> http://localhost:8983/solr/select?q=*:*&facet=true&facet.field=manu&facet.mincount=1&facet.limit=-1&f.manu.facet.namedistinct=0&facet.field=price&f.price.facet.namedistinct=1
> Here is an example on field "hgid" (without namedistinct):
> {code}
> - <lst name="facet_fields">
> - <lst name="hgid">
>   <int name="HGPY0000045FD36D4000A">1</int> 
>   <int name="HGPY00000FBC6690453A9">1</int> 
>   <int name="HGPY00001E44ED6C4FB3B">1</int> 
>   <int name="HGPY00001FA631034A1B8">1</int> 
>   <int name="HGPY00003317ABAC43B48">1</int> 
>   <int name="HGPY00003A17B2294CB5A">5</int> 
>   <int name="HGPY00003ADD2B3D48C39">1</int> 
>   </lst>
>   </lst>
> {code}
> With namedistinct (HGPY0000045FD36D4000A, HGPY00000FBC6690453A9, HGPY00001E44ED6C4FB3B, HGPY00001FA631034A1B8, HGPY00003317ABAC43B48, HGPY00003A17B2294CB5A, HGPY00003ADD2B3D48C39). This returns number of rows (7), not the number of values (11).
> {code}
> - <lst name="facet_fields">
> - <lst name="hgid">
>   <int name="_count_">7</int> 
>   </lst>
>   </lst>
> {code}
> This works really well for getting the total number of groups for group.field=hgid. Enjoy!



[jira] Updated: (SOLR-2242) Get distinct count of names for a facet field

Posted by "Bill Bell (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bill Bell updated SOLR-2242:
----------------------------

    Attachment:     (was: SOLR-2242-distinctFacet.patch)

> Get distinct count of names for a facet field
> ---------------------------------------------
>
>                 Key: SOLR-2242
>                 URL: https://issues.apache.org/jira/browse/SOLR-2242
>             Project: Solr
>          Issue Type: New Feature
>          Components: Response Writers
>    Affects Versions: 4.0
>            Reporter: Bill Bell
>            Priority: Minor
>             Fix For: 4.0
>
>         Attachments: SOLR.2242.v2.patch
>
>
> When returning facet.field=<name of field> you will get a list of matches for distinct values. This is normal behavior. This patch tells you how many distinct values you have (# of rows). Use with limit=-1 and mincount=1.
> The feature is called "namedistinct". Here is an example:
> http://localhost:8983/solr/select?q=*:*&facet=true&facet.field=manu&facet.mincount=1&facet.limit=-1&f.manu.facet.namedistinct=0&facet.field=price&f.price.facet.namedistinct=1
> Here is an example on field "hgid" (without namedistinct):
> {code}
> - <lst name="facet_fields">
> - <lst name="hgid">
>   <int name="HGPY0000045FD36D4000A">1</int> 
>   <int name="HGPY00000FBC6690453A9">1</int> 
>   <int name="HGPY00001E44ED6C4FB3B">1</int> 
>   <int name="HGPY00001FA631034A1B8">1</int> 
>   <int name="HGPY00003317ABAC43B48">1</int> 
>   <int name="HGPY00003A17B2294CB5A">5</int> 
>   <int name="HGPY00003ADD2B3D48C39">1</int> 
>   </lst>
>   </lst>
> {code}
> With namedistinct (HGPY0000045FD36D4000A, HGPY00000FBC6690453A9, HGPY00001E44ED6C4FB3B, HGPY00001FA631034A1B8, HGPY00003317ABAC43B48, HGPY00003A17B2294CB5A, HGPY00003ADD2B3D48C39). This returns number of rows (7), not the number of values (11).
> {code}
> - <lst name="facet_fields">
> - <lst name="hgid">
>   <int name="_count_">7</int> 
>   </lst>
>   </lst>
> {code}
> This works really well for getting the total number of groups for group.field=hgid. Enjoy!



[jira] [Commented] (SOLR-2242) Get distinct count of names for a facet field

Posted by "Chris Male (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13071156#comment-13071156 ] 

Chris Male commented on SOLR-2242:
----------------------------------

{code}
That seems reasonable – though I think we would also want to be able to have the sum when you know that all shards have unique values.
{code}

Perhaps we should return the maximum and sum of all shard counts?  That way, assuming the client knew how many shards exist, they could handle most scenarios.
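
A minimal sketch of that idea, assuming each shard reports only its own distinct-term count. The function name and the sample terms are hypothetical and not taken from any patch on this issue. The maximum is a lower bound on the true distinct count, and the sum is an upper bound that is exact only when no term lives on more than one shard.

```python
def distinct_count_bounds(shard_counts):
    """Return (lower, upper) bounds on the global distinct-term count."""
    if not shard_counts:
        return (0, 0)
    # max: at least one shard saw this many distinct terms.
    # sum: reached only if the shards share no terms at all.
    return (max(shard_counts), sum(shard_counts))


# Hypothetical per-shard term sets; "19.95" appears on both shards.
shard_terms = [{"0.0", "11.5", "19.95"}, {"19.95", "74.99"}]
counts = [len(terms) for terms in shard_terms]

lower, upper = distinct_count_bounds(counts)
exact = len(set().union(*shard_terms))
print(lower, exact, upper)  # 3 4 5
```

A client that knows its shards hold disjoint values can use the sum directly; otherwise the two numbers only bracket the answer.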

{code}
I don't think Bill is referring to the accuracy/meaning of distinct count in distributed search. His problem is that if we change the output format, we also need to update the code that collects the various values and passes them along. This patch just adds a magic value (numFacetTerms) to the count list so that the value is handled by the existing distributed response parsing. This is a fine one-off solution, but I am -1 on adding any more magic field names to Solr. To add this feature, I think we need to bite the bullet and update the facet response format.
{code}

Absolutely.  I hadn't even considered the prospect of not changing the distributed response parsing.

> Get distinct count of names for a facet field
> ---------------------------------------------
>
>                 Key: SOLR-2242
>                 URL: https://issues.apache.org/jira/browse/SOLR-2242
>             Project: Solr
>          Issue Type: New Feature
>          Components: Response Writers
>    Affects Versions: 4.0
>            Reporter: Bill Bell
>            Assignee: Simon Willnauer
>            Priority: Minor
>             Fix For: 4.0
>
>         Attachments: NumFacetTermsFacetsTest.java, SOLR-2242-notworkingtest.patch, SOLR-2242.patch, SOLR-2242.patch, SOLR-2242.shard.patch, SOLR-2242.shard.patch, SOLR-2242.shard.withtests.patch, SOLR-2242.solr3.1.patch, SOLR.2242.solr3.1.patch, SOLR.2242.v2.patch
>
>
> When returning facet.field=<name of field> you will get a list of matches for distinct values. This is normal behavior. This patch tells you how many distinct values you have (# of rows). Use with limit=-1 and mincount=1.
> The feature is called "namedistinct". Here is an example:
> http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=2&facet.limit=-1&facet.field=price
> http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=0&facet.limit=-1&facet.field=price
> http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=1&facet.limit=-1&facet.field=price
> This currently only works on facet.field.
> {code}
> <lst name="facet_fields">
>   <lst name="price">
>     <int name="numFacetTerms">14</int>
>     <int name="0.0">3</int><int name="11.5">1</int><int name="19.95">1</int><int name="74.99">1</int><int name="92.0">1</int><int name="179.99">1</int><int name="185.0">1</int><int name="279.95">1</int><int name="329.95">1</int><int name="350.0">1</int><int name="399.0">1</int><int name="479.95">1</int><int name="649.99">1</int><int name="2199.0">1</int>
>   </lst>
> </lst>
> {code} 
> Several people use this to get the group.field count (the # of groups).



[jira] [Updated] (SOLR-2242) Get distinct count of names for a facet field

Posted by "Bill Bell (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bill Bell updated SOLR-2242:
----------------------------

    Attachment:     (was: SOLR-2242.shard.patch)
    
> Get distinct count of names for a facet field
> ---------------------------------------------
>
>                 Key: SOLR-2242
>                 URL: https://issues.apache.org/jira/browse/SOLR-2242
>             Project: Solr
>          Issue Type: New Feature
>          Components: Response Writers
>    Affects Versions: 4.0
>            Reporter: Bill Bell
>            Priority: Minor
>             Fix For: 4.0
>
>         Attachments: SOLR-2242-solr40-3.patch, SOLR-2242.patch, SOLR-2242.patch, SOLR-2242.patch, SOLR-2242.shard.withtests.patch, SOLR-2242.solr3.1-fix.patch, SOLR-2242.solr3.1.patch, SOLR-2242.solr35.patch, SOLR.2242.solr3.1.patch
>
>
> When returning facet.field=<name of field> you will get a list of matches for distinct values. This is normal behavior. This patch tells you how many distinct values you have (# of rows). Use with limit=-1 and mincount=1.
> The feature is called "namedistinct". Here is an example:
> http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=2&facet.limit=-1&facet.field=price
> http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=0&facet.limit=-1&facet.field=price
> http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=1&facet.limit=-1&facet.field=price
> This currently only works on facet.field.
> {code}
> <lst name="facet_fields">
>   <lst name="price">
>     <int name="numFacetTerms">14</int>
>     <int name="0.0">3</int><int name="11.5">1</int><int name="19.95">1</int><int name="74.99">1</int><int name="92.0">1</int><int name="179.99">1</int><int name="185.0">1</int><int name="279.95">1</int><int name="329.95">1</int><int name="350.0">1</int><int name="399.0">1</int><int name="479.95">1</int><int name="649.99">1</int><int name="2199.0">1</int>
>   </lst>
> </lst>
> {code} 
> Several people use this to get the group.field count (the # of groups).



[jira] [Updated] (SOLR-2242) Get distinct count of names for a facet field

Posted by "Bill Bell (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bill Bell updated SOLR-2242:
----------------------------

    Attachment:     (was: NumFacetTermsFacetsTest.java)
    
> Get distinct count of names for a facet field
> ---------------------------------------------
>
>                 Key: SOLR-2242
>                 URL: https://issues.apache.org/jira/browse/SOLR-2242
>             Project: Solr
>          Issue Type: New Feature
>          Components: Response Writers
>    Affects Versions: 4.0
>            Reporter: Bill Bell
>            Priority: Minor
>             Fix For: 4.0
>
>         Attachments: SOLR-2242-solr40-3.patch, SOLR-2242.patch, SOLR-2242.patch, SOLR-2242.patch, SOLR-2242.shard.patch, SOLR-2242.shard.withtests.patch, SOLR-2242.solr3.1-fix.patch, SOLR-2242.solr3.1.patch, SOLR-2242.solr35.patch, SOLR.2242.solr3.1.patch
>
>
> When returning facet.field=<name of field> you will get a list of matches for distinct values. This is normal behavior. This patch tells you how many distinct values you have (# of rows). Use with limit=-1 and mincount=1.
> The feature is called "namedistinct". Here is an example:
> http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=2&facet.limit=-1&facet.field=price
> http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=0&facet.limit=-1&facet.field=price
> http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=1&facet.limit=-1&facet.field=price
> This currently only works on facet.field.
> {code}
> <lst name="facet_fields">
>   <lst name="price">
>     <int name="numFacetTerms">14</int>
>     <int name="0.0">3</int><int name="11.5">1</int><int name="19.95">1</int><int name="74.99">1</int><int name="92.0">1</int><int name="179.99">1</int><int name="185.0">1</int><int name="279.95">1</int><int name="329.95">1</int><int name="350.0">1</int><int name="399.0">1</int><int name="479.95">1</int><int name="649.99">1</int><int name="2199.0">1</int>
>   </lst>
> </lst>
> {code} 
> Several people use this to get the group.field count (the # of groups).



[jira] Commented: (SOLR-2242) Get distinct count of names for a facet field

Posted by "Yonik Seeley (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13006758#comment-13006758 ] 

Yonik Seeley commented on SOLR-2242:
------------------------------------

bq. Not sure what "constraints" means?

It's a facet value like "HGPY0000045FD36D4000A" in your example.

bq. Would we always include this or just add it as an option?

It will require disabling certain optimizations, and should thus be optional (and off by default).

FYI, the "missing" entry I threw in is also a different way to represent the count calculated via facet.missing=true, instead of adding it in with the other counts under a null key (which JSON does not support).
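
To illustrate the null-key problem, here is a small sketch using Python's standard json module rather than Solr's own response writer. The "missing"/"counts" layout is a hypothetical shape for illustration, not a confirmed response format. JSON object keys must be strings, so a null key cannot survive serialization as-is.

```python
import json

# Counts keyed by facet value; None stands in for documents with no value.
counts_with_null_key = {"HGPY00003A17B2294CB5A": 5, None: 2}

# json.dumps coerces the None key to the string "null", so after a round
# trip it can no longer be told apart from a real term spelled "null".
decoded = json.loads(json.dumps(counts_with_null_key))
print(decoded)  # {'HGPY00003A17B2294CB5A': 5, 'null': 2}

# Hypothetical alternative: report the missing count under its own key,
# outside the map of term counts.
separate = {"missing": 2, "counts": {"HGPY00003A17B2294CB5A": 5}}
round_tripped = json.loads(json.dumps(separate))
print(round_tripped == separate)  # True
```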

> Get distinct count of names for a facet field
> ---------------------------------------------
>
>                 Key: SOLR-2242
>                 URL: https://issues.apache.org/jira/browse/SOLR-2242
>             Project: Solr
>          Issue Type: New Feature
>          Components: Response Writers
>    Affects Versions: 4.0
>            Reporter: Bill Bell
>            Priority: Minor
>             Fix For: 4.0
>
>         Attachments: SOLR-2242-distinctFacet.patch
>
>
> When returning facet.field=<name of field> you will get a list of matches for distinct values. This is normal behavior. This patch tells you how many distinct values you have (# of rows). Use with limit=-1 and mincount=1.
> The feature is called "namedistinct". Here is an example:
> http://localhost:8983/solr/select?q=*:*&facet=true&facet.field=manu&facet.mincount=1&facet.limit=-1&f.manu.facet.namedistinct=0&facet.field=price&f.price.facet.namedistinct=1
> Here is an example on field "hgid" (without namedistinct):
> {code}
> - <lst name="facet_fields">
> - <lst name="hgid">
>   <int name="HGPY0000045FD36D4000A">1</int> 
>   <int name="HGPY00000FBC6690453A9">1</int> 
>   <int name="HGPY00001E44ED6C4FB3B">1</int> 
>   <int name="HGPY00001FA631034A1B8">1</int> 
>   <int name="HGPY00003317ABAC43B48">1</int> 
>   <int name="HGPY00003A17B2294CB5A">5</int> 
>   <int name="HGPY00003ADD2B3D48C39">1</int> 
>   </lst>
>   </lst>
> {code}
> With namedistinct (HGPY0000045FD36D4000A, HGPY00000FBC6690453A9, HGPY00001E44ED6C4FB3B, HGPY00001FA631034A1B8, HGPY00003317ABAC43B48, HGPY00003A17B2294CB5A, HGPY00003ADD2B3D48C39). This returns number of rows (7), not the number of values (11).
> {code}
> - <lst name="facet_fields">
> - <lst name="hgid">
>   <int name="_count_">7</int> 
>   </lst>
>   </lst>
> {code}
> This works really well for getting the total number of groups for group.field=hgid. Enjoy!



[jira] [Updated] (SOLR-2242) Get distinct count of names for a facet field

Posted by "Bill Bell (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bill Bell updated SOLR-2242:
----------------------------

    Attachment: SOLR-2242-3x_2.patch

Latest 3x patch is uploaded: SOLR-2242-3x_2.patch
                
> Get distinct count of names for a facet field
> ---------------------------------------------
>
>                 Key: SOLR-2242
>                 URL: https://issues.apache.org/jira/browse/SOLR-2242
>             Project: Solr
>          Issue Type: New Feature
>          Components: Response Writers
>    Affects Versions: 4.0
>            Reporter: Bill Bell
>            Priority: Minor
>             Fix For: 4.0
>
>         Attachments: SOLR-2242-3x.patch, SOLR-2242-3x_2.patch, SOLR-2242-solr40-3.patch, SOLR-2242.patch, SOLR-2242.patch, SOLR-2242.patch, SOLR-2242.shard.withtests.patch, SOLR-2242.solr3.1-fix.patch, SOLR-2242.solr3.1.patch, SOLR-2242.solr35.patch, SOLR.2242.solr3.1.patch
>
>
> When returning facet.field=<name of field> you will get a list of matches for distinct values. This is normal behavior. This patch tells you how many distinct values you have (# of rows). Use with limit=-1 and mincount=1.
> The feature is called "namedistinct". Here is an example:
> http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=2&facet.limit=-1&facet.field=price
> http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=0&facet.limit=-1&facet.field=price
> http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=1&facet.limit=-1&facet.field=price
> This currently only works on facet.field.
> {code}
> <lst name="facet_fields">
>   <lst name="price">
>     <int name="numFacetTerms">14</int>
>     <int name="0.0">3</int><int name="11.5">1</int><int name="19.95">1</int><int name="74.99">1</int><int name="92.0">1</int><int name="179.99">1</int><int name="185.0">1</int><int name="279.95">1</int><int name="329.95">1</int><int name="350.0">1</int><int name="399.0">1</int><int name="479.95">1</int><int name="649.99">1</int><int name="2199.0">1</int>
>   </lst>
> </lst>
> {code} 
> Several people use this to get the group.field count (the # of groups).



[jira] [Commented] (SOLR-2242) Get distinct count of names for a facet field

Posted by "Bill Bell (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13244961#comment-13244961 ] 

Bill Bell commented on SOLR-2242:
---------------------------------

Ready for 3x merge. Test with:

ant test -Dtestcase=NumFacetTermsFacetsTest

                
> Get distinct count of names for a facet field
> ---------------------------------------------
>
>                 Key: SOLR-2242
>                 URL: https://issues.apache.org/jira/browse/SOLR-2242
>             Project: Solr
>          Issue Type: New Feature
>          Components: Response Writers
>    Affects Versions: 4.0
>            Reporter: Bill Bell
>            Priority: Minor
>             Fix For: 4.0
>
>         Attachments: SOLR-2242-3x.patch, SOLR-2242-3x_5_tests.patch, SOLR-2242-solr40-3.patch, SOLR-2242.patch, SOLR-2242.patch, SOLR-2242.patch, SOLR-2242.shard.withtests.patch, SOLR-2242.solr3.1-fix.patch, SOLR-2242.solr3.1.patch, SOLR.2242.solr3.1.patch
>
>
> When returning facet.field=<name of field> you will get a list of matches for distinct values. This is normal behavior. This patch tells you how many distinct values you have (# of rows). Use with limit=-1 and mincount=1.
> The feature is called "namedistinct". Here is an example:
> http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=2&facet.limit=-1&facet.field=price
> http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=0&facet.limit=-1&facet.field=price
> http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=1&facet.limit=-1&facet.field=price
> This currently only works on facet.field.
> {code}
> <lst name="facet_fields">
>   <lst name="price">
>     <int name="numFacetTerms">14</int>
>     <int name="0.0">3</int><int name="11.5">1</int><int name="19.95">1</int><int name="74.99">1</int><int name="92.0">1</int><int name="179.99">1</int><int name="185.0">1</int><int name="279.95">1</int><int name="329.95">1</int><int name="350.0">1</int><int name="399.0">1</int><int name="479.95">1</int><int name="649.99">1</int><int name="2199.0">1</int>
>   </lst>
> </lst>
> {code} 
> Several people use this to get the group.field count (the # of groups).



[jira] [Commented] (SOLR-2242) Get distinct count of names for a facet field

Posted by "Bill Bell (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13044730#comment-13044730 ] 

Bill Bell commented on SOLR-2242:
---------------------------------

It would be easier for sharding not to have multiple lists... I could use some help if we want to change it, since I have not worked with FacetComponent.java.

Otherwise, it would be a simpler fix to just add it and flatten the lists.

{code}
<lst name="facet_fields">
  <lst name="price">
    <int name="numFacetTerms">14</int>
    <int name="0.0">3</int><int name="11.5">1</int><int name="19.95">1</int><int name="74.99">1</int><int name="92.0">1</int><int name="179.99">1</int><int name="185.0">1</int><int name="279.95">1</int><int name="329.95">1</int><int name="350.0">1</int><int name="399.0">1</int><int name="479.95">1</int><int name="649.99">1</int><int name="2199.0">1</int>
  </lst>
</lst>
{code}

Not ideal, but easier for v1? I could also just remove numFacetTerms=2 for now.

It will only require an if statement to ignore the type check for "numFacetTerms".
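
A rough sketch of what special-casing the magic entry could look like when merging flattened per-shard lists. This is an illustration in Python, not the FacetComponent.java code; the function name and the choice to recompute the count from the merged terms are assumptions, and the recomputed value is only right when every shard returns all of its terms (facet.limit=-1).

```python
from collections import Counter

MAGIC_KEY = "numFacetTerms"  # the special entry from the example response


def merge_shard_facets(shard_lists):
    """Merge flattened (name, count) lists, one list per shard."""
    merged = Counter()
    for entries in shard_lists:
        for name, count in entries:
            if name == MAGIC_KEY:
                # Not a real term: summing it would double-count terms
                # that exist on more than one shard.
                continue
            merged[name] += count
    # Recompute the distinct count from the merged terms.
    return [(MAGIC_KEY, len(merged))] + sorted(merged.items())


shard_a = [("numFacetTerms", 2), ("0.0", 2), ("11.5", 1)]
shard_b = [("numFacetTerms", 2), ("0.0", 1), ("19.95", 1)]
result = merge_shard_facets([shard_a, shard_b])
print(result)
# [('numFacetTerms', 3), ('0.0', 3), ('11.5', 1), ('19.95', 1)]
```

Note that a real term that happens to be named "numFacetTerms" would be silently dropped here, which is the usual risk with magic names.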

Here is a patch that works with sharding.

http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=2&facet.limit=-1&facet.field=price

Enjoy.

Bill





> Get distinct count of names for a facet field
> ---------------------------------------------
>
>                 Key: SOLR-2242
>                 URL: https://issues.apache.org/jira/browse/SOLR-2242
>             Project: Solr
>          Issue Type: New Feature
>          Components: Response Writers
>    Affects Versions: 4.0
>            Reporter: Bill Bell
>            Priority: Minor
>             Fix For: 4.0
>
>         Attachments: SOLR-2242.patch, SOLR-2242.solr3.1.patch, SOLR.2242.solr3.1.patch, SOLR.2242.v2.patch
>
>
> When returning facet.field=<name of field> you will get a list of matches for distinct values. This is normal behavior. This patch tells you how many distinct values you have (# of rows). Use with limit=-1 and mincount=1.
> The feature is called "namedistinct". Here is an example:
> http://localhost:8983/solr/select?q=*:*&facet=true&facet.field=manu&facet.mincount=1&facet.limit=-1&f.manu.facet.namedistinct=0&facet.field=price&f.price.facet.namedistinct=1
> Here is an example on field "hgid" (without namedistinct):
> {code}
> - <lst name="facet_fields">
> - <lst name="hgid">
>   <int name="HGPY0000045FD36D4000A">1</int> 
>   <int name="HGPY00000FBC6690453A9">1</int> 
>   <int name="HGPY00001E44ED6C4FB3B">1</int> 
>   <int name="HGPY00001FA631034A1B8">1</int> 
>   <int name="HGPY00003317ABAC43B48">1</int> 
>   <int name="HGPY00003A17B2294CB5A">5</int> 
>   <int name="HGPY00003ADD2B3D48C39">1</int> 
>   </lst>
>   </lst>
> {code}
> With namedistinct (HGPY0000045FD36D4000A, HGPY00000FBC6690453A9, HGPY00001E44ED6C4FB3B, HGPY00001FA631034A1B8, HGPY00003317ABAC43B48, HGPY00003A17B2294CB5A, HGPY00003ADD2B3D48C39). This returns number of rows (7), not the number of values (11).
> {code}
> - <lst name="facet_fields">
> - <lst name="hgid">
>   <int name="_count_">7</int> 
>   </lst>
>   </lst>
> {code}
> This works really well for getting the total number of groups for group.field=hgid. Enjoy!



[jira] [Commented] (SOLR-2242) Get distinct count of names for a facet field

Posted by "Jason Rutherglen (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13291803#comment-13291803 ] 

Jason Rutherglen commented on SOLR-2242:
----------------------------------------

Terrance, can you post a patch to the Jira?  It makes sense to start this Jira off non-distributed, and add a distributed version in another Jira issue...
                
> Get distinct count of names for a facet field
> ---------------------------------------------
>
>                 Key: SOLR-2242
>                 URL: https://issues.apache.org/jira/browse/SOLR-2242
>             Project: Solr
>          Issue Type: New Feature
>          Components: Response Writers
>    Affects Versions: 4.0
>            Reporter: Bill Bell
>            Priority: Minor
>             Fix For: 4.0
>
>         Attachments: SOLR-2242-3x.patch, SOLR-2242-3x_5_tests.patch, SOLR-2242-solr40-3.patch, SOLR-2242.patch, SOLR-2242.patch, SOLR-2242.patch, SOLR-2242.shard.withtests.patch, SOLR-2242.solr3.1-fix.patch, SOLR-2242.solr3.1.patch, SOLR.2242.solr3.1.patch
>
>
> When returning facet.field=<name of field> you will get a list of matches for distinct values. This is normal behavior. This patch tells you how many distinct values you have (# of rows). Use with limit=-1 and mincount=1.
> The feature is called "namedistinct". Here is an example:
> http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=2&facet.limit=-1&facet.field=price
> http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=0&facet.limit=-1&facet.field=price
> http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=1&facet.limit=-1&facet.field=price
> This currently only works on facet.field.
> {code}
> <lst name="facet_fields">
>   <lst name="price">
>     <int name="numFacetTerms">14</int>
>     <int name="0.0">3</int><int name="11.5">1</int><int name="19.95">1</int><int name="74.99">1</int><int name="92.0">1</int><int name="179.99">1</int><int name="185.0">1</int><int name="279.95">1</int><int name="329.95">1</int><int name="350.0">1</int><int name="399.0">1</int><int name="479.95">1</int><int name="649.99">1</int><int name="2199.0">1</int>
>   </lst>
> </lst>
> {code} 
> Several people use this to get the group.field count (the # of groups).



[jira] [Commented] (SOLR-2242) Get distinct count of names for a facet field

Posted by "Bill Bell (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13026826#comment-13026826 ] 

Bill Bell commented on SOLR-2242:
---------------------------------

I am not seeing the performance problem.

If you are outputting facets anyway, the loop will run and the list will be built regardless, so in that case it is probably as efficient as it can be.
That is why I had the 0/1/2 settings. I was reusing the code and just looking at the list size:

countFacetTerms.size()
counts.size()

There is a lot of logic in getListedTermCounts() and getTermCountsLimit(). If we optimize and just add a counter, we need to make sure the new methods are not forgotten about (test cases?). I have seen that happen numerous times.
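
To make the trade-off concrete, here is a small sketch of the two approaches in Python. Both function names are hypothetical and neither mirrors the actual Solr methods. The first builds the facet list and takes its size, which costs nothing extra when the list is returned anyway; the second only counts and never materializes the list.

```python
def count_via_list(term_counts, mincount=1):
    """Build the facet list as usual, then report its size."""
    kept = [(term, n) for term, n in term_counts if n >= mincount]
    return len(kept), kept


def count_via_counter(term_counts, mincount=1):
    """Count qualifying terms without building the list."""
    return sum(1 for _, n in term_counts if n >= mincount)


# Counts from the hgid example, plus one term that fails mincount=1.
hgid_counts = [
    ("HGPY0000045FD36D4000A", 1),
    ("HGPY00000FBC6690453A9", 1),
    ("HGPY00001E44ED6C4FB3B", 1),
    ("HGPY00001FA631034A1B8", 1),
    ("HGPY00003317ABAC43B48", 1),
    ("HGPY00003A17B2294CB5A", 5),
    ("HGPY00003ADD2B3D48C39", 1),
    ("HGPY_UNMATCHED", 0),
]

size, facets = count_via_list(hgid_counts)
print(size, count_via_counter(hgid_counts))  # 7 7
```

A test asserting that both paths agree is one way to keep a counter-only code path from drifting.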




> Get distinct count of names for a facet field
> ---------------------------------------------
>
>                 Key: SOLR-2242
>                 URL: https://issues.apache.org/jira/browse/SOLR-2242
>             Project: Solr
>          Issue Type: New Feature
>          Components: Response Writers
>    Affects Versions: 4.0
>            Reporter: Bill Bell
>            Priority: Minor
>             Fix For: 4.0
>
>         Attachments: SOLR-2242.patch, SOLR-2242.solr3.1.patch, SOLR.2242.solr3.1.patch, SOLR.2242.v2.patch
>
>
> When returning facet.field=<name of field> you will get a list of matches for distinct values. This is normal behavior. This patch tells you how many distinct values you have (# of rows). Use with limit=-1 and mincount=1.
> The feature is called "namedistinct". Here is an example:
> http://localhost:8983/solr/select?q=*:*&facet=true&facet.field=manu&facet.mincount=1&facet.limit=-1&f.manu.facet.namedistinct=0&facet.field=price&f.price.facet.namedistinct=1
> Here is an example on field "hgid" (without namedistinct):
> {code}
> - <lst name="facet_fields">
> - <lst name="hgid">
>   <int name="HGPY0000045FD36D4000A">1</int> 
>   <int name="HGPY00000FBC6690453A9">1</int> 
>   <int name="HGPY00001E44ED6C4FB3B">1</int> 
>   <int name="HGPY00001FA631034A1B8">1</int> 
>   <int name="HGPY00003317ABAC43B48">1</int> 
>   <int name="HGPY00003A17B2294CB5A">5</int> 
>   <int name="HGPY00003ADD2B3D48C39">1</int> 
>   </lst>
>   </lst>
> {code}
> With namedistinct (HGPY0000045FD36D4000A, HGPY00000FBC6690453A9, HGPY00001E44ED6C4FB3B, HGPY00001FA631034A1B8, HGPY00003317ABAC43B48, HGPY00003A17B2294CB5A, HGPY00003ADD2B3D48C39). This returns number of rows (7), not the number of values (11).
> {code}
> - <lst name="facet_fields">
> - <lst name="hgid">
>   <int name="_count_">7</int> 
>   </lst>
>   </lst>
> {code}
> This works actually really good to get total number of fields for a group.field=hgid. Enjoy!



[jira] Commented: (SOLR-2242) Get distinct count of names for a facet field

Posted by "Otis Gospodnetic (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13006807#comment-13006807 ] 

Otis Gospodnetic commented on SOLR-2242:
----------------------------------------

Would this be more consistent?  facet.numfacetterms => facet.numFacetTerms

> Get distinct count of names for a facet field
> ---------------------------------------------
>
>                 Key: SOLR-2242
>                 URL: https://issues.apache.org/jira/browse/SOLR-2242
>             Project: Solr
>          Issue Type: New Feature
>          Components: Response Writers
>    Affects Versions: 4.0
>            Reporter: Bill Bell
>            Priority: Minor
>             Fix For: 4.0
>
>         Attachments: SOLR-2242-distinctFacet.patch, SOLR-2242.v2.patch
>
>
> When returning facet.field=<name of field> you will get a list of matches for distinct values. This is normal behavior. This patch tells you how many distinct values you have (# of rows). Use with limit=-1 and mincount=1.
> The feature is called "namedistinct". Here is an example:
> http://localhost:8983/solr/select?q=*:*&facet=true&facet.field=manu&facet.mincount=1&facet.limit=-1&f.manu.facet.namedistinct=0&facet.field=price&f.price.facet.namedistinct=1
> Here is an example on field "hgid" (without namedistinct):
> {code}
> - <lst name="facet_fields">
> - <lst name="hgid">
>   <int name="HGPY0000045FD36D4000A">1</int> 
>   <int name="HGPY00000FBC6690453A9">1</int> 
>   <int name="HGPY00001E44ED6C4FB3B">1</int> 
>   <int name="HGPY00001FA631034A1B8">1</int> 
>   <int name="HGPY00003317ABAC43B48">1</int> 
>   <int name="HGPY00003A17B2294CB5A">5</int> 
>   <int name="HGPY00003ADD2B3D48C39">1</int> 
>   </lst>
>   </lst>
> {code}
> With namedistinct (HGPY0000045FD36D4000A, HGPY00000FBC6690453A9, HGPY00001E44ED6C4FB3B, HGPY00001FA631034A1B8, HGPY00003317ABAC43B48, HGPY00003A17B2294CB5A, HGPY00003ADD2B3D48C39). This returns number of rows (7), not the number of values (11).
> {code}
> - <lst name="facet_fields">
> - <lst name="hgid">
>   <int name="_count_">7</int> 
>   </lst>
>   </lst>
> {code}
> This works actually really good to get total number of fields for a group.field=hgid. Enjoy!



[jira] [Commented] (SOLR-2242) Get distinct count of names for a facet field

Posted by "Bill Bell (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13233181#comment-13233181 ] 

Bill Bell commented on SOLR-2242:
---------------------------------

Cody,

I love your suggestion. I am actually ready to work on it. 

{code}
<lst name="facet_numTerms">
   <int name="text">124</int>
</lst>
{code}

After we get it committed we should then fix the shard issues as per SOLR-3134.

We can also create a new JIRA ticket for that. 

Everyone agreed?

I will do it on Solr 4.0 and backport it to 3.5.
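
As a rough illustration of how a client might read the proposed format, here is a sketch in Python. The sample response, including the enclosing facet_counts element and the placement of facet_numTerms next to facet_fields, is an assumption based on the snippet above, and read_num_terms is a hypothetical helper, not part of any Solr client.

```python
import xml.etree.ElementTree as ET

# Assumed response shape: facet_numTerms sits beside facet_fields,
# so clients that only read facet_fields are not affected.
RESPONSE = """
<lst name="facet_counts">
  <lst name="facet_fields">
    <lst name="text"/>
  </lst>
  <lst name="facet_numTerms">
    <int name="text">124</int>
  </lst>
</lst>
"""


def read_num_terms(xml_text: str) -> dict:
    """Return {field name: distinct term count} from a facet_numTerms block."""
    root = ET.fromstring(xml_text)
    result = {}
    for lst in root.iter("lst"):
        if lst.get("name") == "facet_numTerms":
            for entry in lst.findall("int"):
                result[entry.get("name")] = int(entry.text)
    return result


if __name__ == "__main__":
    print(read_num_terms(RESPONSE))  # {'text': 124}
```

Keeping the count in its own list, instead of inside each field's facet list, means existing parsers never see an extra pseudo-term among the facet values.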



                
> Get distinct count of names for a facet field
> ---------------------------------------------
>
>                 Key: SOLR-2242
>                 URL: https://issues.apache.org/jira/browse/SOLR-2242
>             Project: Solr
>          Issue Type: New Feature
>          Components: Response Writers
>    Affects Versions: 4.0
>            Reporter: Bill Bell
>            Assignee: Erick Erickson
>            Priority: Minor
>             Fix For: 4.0
>
>         Attachments: NumFacetTermsFacetsTest.java, SOLR-2242-notworkingtest.patch, SOLR-2242.patch, SOLR-2242.patch, SOLR-2242.patch, SOLR-2242.shard.patch, SOLR-2242.shard.patch, SOLR-2242.shard.withtests.patch, SOLR-2242.solr3.1-fix.patch, SOLR-2242.solr3.1.patch, SOLR.2242.solr3.1.patch, SOLR.2242.v2.patch
>
>
> When returning facet.field=<name of field> you will get a list of matches for distinct values. This is normal behavior. This patch tells you how many distinct values you have (# of rows). Use with limit=-1 and mincount=1.
> The feature is called "namedistinct". Here is an example:
> http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=2&facet.limit=-1&facet.field=price
> http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=0&facet.limit=-1&facet.field=price
> http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=1&facet.limit=-1&facet.field=price
> This currently only works on facet.field.
> {code}
> <lst name="facet_fields">
>   <lst name="price">
>     <int name="numFacetTerms">14</int>
>     <int name="0.0">3</int><int name="11.5">1</int><int name="19.95">1</int><int name="74.99">1</int><int name="92.0">1</int><int name="179.99">1</int><int name="185.0">1</int><int name="279.95">1</int><int name="329.95">1</int><int name="350.0">1</int><int name="399.0">1</int><int name="479.95">1</int><int name="649.99">1</int><int name="2199.0">1</int>
>   </lst>
> </lst>
> {code} 
> Several people use this to get the group.field count (the # of groups).



[jira] [Commented] (SOLR-2242) Get distinct count of names for a facet field

Posted by "Chris Male (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13072827#comment-13072827 ] 

Chris Male commented on SOLR-2242:
----------------------------------

I really want to avoid having to load the list just to calculate the counts; it seems unnecessary and a waste of memory.  I think we should start simple and implement what you originally suggested.

> Get distinct count of names for a facet field
> ---------------------------------------------
>
>                 Key: SOLR-2242
>                 URL: https://issues.apache.org/jira/browse/SOLR-2242
>             Project: Solr
>          Issue Type: New Feature
>          Components: Response Writers
>    Affects Versions: 4.0
>            Reporter: Bill Bell
>            Assignee: Simon Willnauer
>            Priority: Minor
>             Fix For: 4.0
>
>         Attachments: NumFacetTermsFacetsTest.java, SOLR-2242-notworkingtest.patch, SOLR-2242.patch, SOLR-2242.patch, SOLR-2242.shard.patch, SOLR-2242.shard.patch, SOLR-2242.shard.withtests.patch, SOLR-2242.solr3.1.patch, SOLR.2242.solr3.1.patch, SOLR.2242.v2.patch
>
>
> When returning facet.field=<name of field> you will get a list of matches for distinct values. This is normal behavior. This patch tells you how many distinct values you have (# of rows). Use with limit=-1 and mincount=1.
> The feature is called "namedistinct". Here is an example:
> http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=2&facet.limit=-1&facet.field=price
> http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=0&facet.limit=-1&facet.field=price
> http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=1&facet.limit=-1&facet.field=price
> This currently only works on facet.field.
> {code}
> <lst name="facet_fields">
>   <lst name="price">
>     <int name="numFacetTerms">14</int>
>     <int name="0.0">3</int><int name="11.5">1</int><int name="19.95">1</int><int name="74.99">1</int><int name="92.0">1</int><int name="179.99">1</int><int name="185.0">1</int><int name="279.95">1</int><int name="329.95">1</int><int name="350.0">1</int><int name="399.0">1</int><int name="479.95">1</int><int name="649.99">1</int><int name="2199.0">1</int>
>   </lst>
> </lst>
> {code} 
> Several people use this to get the group.field count (the # of groups).



[jira] [Commented] (SOLR-2242) Get distinct count of names for a facet field

Posted by "Erick Erickson (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13176065#comment-13176065 ] 

Erick Erickson commented on SOLR-2242:
--------------------------------------

Just to be clear, I'm not volunteering to actually *implement* this patch. I'll gladly guide it through the process if someone wants to work on it and address the concerns raised. And I'll keep prodding it along and try to keep it from dying on the vine, and certainly volunteer to test various incarnations. Or I'll try to kill it if it comes to that.

There are really two open issues, of which the most pressing seems to be back-compat. Cody's initial suggestion doesn't work with all the various response formats. Working out a way to change the response format without breaking back-compat seems like a worthy goal in itself, but does that mean we need to create another JIRA for that and make this JIRA dependent on the new one? Note that this is the inverse of my original point <3>: I'm suggesting we fix the back-compat issue before we address this one. I have no real clue yet how to approach that, mind you.

Again, I want a clear goal in mind before we put work into *any* solution.
                
> Get distinct count of names for a facet field
> ---------------------------------------------
>
>                 Key: SOLR-2242
>                 URL: https://issues.apache.org/jira/browse/SOLR-2242
>             Project: Solr
>          Issue Type: New Feature
>          Components: Response Writers
>    Affects Versions: 4.0
>            Reporter: Bill Bell
>            Assignee: Erick Erickson
>            Priority: Minor
>             Fix For: 4.0
>
>         Attachments: NumFacetTermsFacetsTest.java, SOLR-2242-notworkingtest.patch, SOLR-2242.patch, SOLR-2242.patch, SOLR-2242.patch, SOLR-2242.shard.patch, SOLR-2242.shard.patch, SOLR-2242.shard.withtests.patch, SOLR-2242.solr3.1-fix.patch, SOLR-2242.solr3.1.patch, SOLR.2242.solr3.1.patch, SOLR.2242.v2.patch
>
>
> When returning facet.field=<name of field> you will get a list of matches for distinct values. This is normal behavior. This patch tells you how many distinct values you have (# of rows). Use with limit=-1 and mincount=1.
> The feature is called "namedistinct". Here is an example:
> http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=2&facet.limit=-1&facet.field=price
> http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=0&facet.limit=-1&facet.field=price
> http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=1&facet.limit=-1&facet.field=price
> This currently only works on facet.field.
> {code}
> <lst name="facet_fields">
>   <lst name="price">
>     <int name="numFacetTerms">14</int>
>     <int name="0.0">3</int><int name="11.5">1</int><int name="19.95">1</int><int name="74.99">1</int><int name="92.0">1</int><int name="179.99">1</int><int name="185.0">1</int><int name="279.95">1</int><int name="329.95">1</int><int name="350.0">1</int><int name="399.0">1</int><int name="479.95">1</int><int name="649.99">1</int><int name="2199.0">1</int>
>   </lst>
> </lst>
> {code} 
> Several people use this to get the group.field count (the # of groups).



[jira] Issue Comment Edited: (SOLR-2242) Get distinct count of names for a facet field

Posted by "Bill Bell (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13006806#comment-13006806 ] 

Bill Bell edited comment on SOLR-2242 at 3/15/11 6:06 AM:
----------------------------------------------------------

v2 of the release based on feedback.

Note: SOLR-2242-distinctFacet.patch not needed (left for history)

      was (Author: billnbell):
    v2 of the release based on feedback.
  
> Get distinct count of names for a facet field
> ---------------------------------------------
>
>                 Key: SOLR-2242
>                 URL: https://issues.apache.org/jira/browse/SOLR-2242
>             Project: Solr
>          Issue Type: New Feature
>          Components: Response Writers
>    Affects Versions: 4.0
>            Reporter: Bill Bell
>            Priority: Minor
>             Fix For: 4.0
>
>         Attachments: SOLR-2242-distinctFacet.patch, SOLR-2242.v2.patch
>
>
> When returning facet.field=<name of field> you will get a list of matches for distinct values. This is normal behavior. This patch tells you how many distinct values you have (# of rows). Use with limit=-1 and mincount=1.
> The feature is called "namedistinct". Here is an example:
> http://localhost:8983/solr/select?q=*:*&facet=true&facet.field=manu&facet.mincount=1&facet.limit=-1&f.manu.facet.namedistinct=0&facet.field=price&f.price.facet.namedistinct=1
> Here is an example on field "hgid" (without namedistinct):
> {code}
> - <lst name="facet_fields">
> - <lst name="hgid">
>   <int name="HGPY0000045FD36D4000A">1</int> 
>   <int name="HGPY00000FBC6690453A9">1</int> 
>   <int name="HGPY00001E44ED6C4FB3B">1</int> 
>   <int name="HGPY00001FA631034A1B8">1</int> 
>   <int name="HGPY00003317ABAC43B48">1</int> 
>   <int name="HGPY00003A17B2294CB5A">5</int> 
>   <int name="HGPY00003ADD2B3D48C39">1</int> 
>   </lst>
>   </lst>
> {code}
> With namedistinct (HGPY0000045FD36D4000A, HGPY00000FBC6690453A9, HGPY00001E44ED6C4FB3B, HGPY00001FA631034A1B8, HGPY00003317ABAC43B48, HGPY00003A17B2294CB5A, HGPY00003ADD2B3D48C39). This returns number of rows (7), not the number of values (11).
> {code}
> - <lst name="facet_fields">
> - <lst name="hgid">
>   <int name="_count_">7</int> 
>   </lst>
>   </lst>
> {code}
> This works actually really good to get total number of fields for a group.field=hgid. Enjoy!



[jira] [Commented] (SOLR-2242) Get distinct count of names for a facet field

Posted by "Antoine Le Floc'h (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13218232#comment-13218232 ] 

Antoine Le Floc'h commented on SOLR-2242:
-----------------------------------------

About the distribution issue, it looks like https://issues.apache.org/jira/browse/SOLR-3134 follows thinking similar to my post from 03/Jan/12: show the info per shard. Even though the counter info cannot be aggregated across shards, knowing what the counter is for each shard would allow each user to use the info as they want. It would work in the single-shard case too.
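
To make the aggregation problem concrete, here is a small sketch in Python. The shard names and term sets are made-up sample data, and both helper functions are hypothetical. It shows why per-shard distinct counts cannot simply be added up: a term present on several shards would be counted once per shard.

```python
from typing import Dict, Set

# Made-up sample data: the facet terms seen on each shard.
SHARDS: Dict[str, Set[str]] = {
    "localhost:8983/solr": {"0.0", "11.5", "19.95"},
    "localhost:7574/solr": {"0.0", "74.99"},
}


def distinct_per_shard(shards: Dict[str, Set[str]]) -> Dict[str, int]:
    """Distinct term count reported by each shard on its own."""
    return {name: len(terms) for name, terms in shards.items()}


def distinct_merged(shards: Dict[str, Set[str]]) -> int:
    """True distinct count, which needs the terms themselves, not just the counts."""
    return len(set().union(*shards.values()))


if __name__ == "__main__":
    per_shard = distinct_per_shard(SHARDS)
    # Summing the per-shard counts overcounts "0.0", which is on both shards.
    print(per_shard, sum(per_shard.values()), distinct_merged(SHARDS))
```

Reporting the counter per shard sidesteps the problem by not pretending to merge: the caller sees each shard's number and decides what to do with it.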
                
> Get distinct count of names for a facet field
> ---------------------------------------------
>
>                 Key: SOLR-2242
>                 URL: https://issues.apache.org/jira/browse/SOLR-2242
>             Project: Solr
>          Issue Type: New Feature
>          Components: Response Writers
>    Affects Versions: 4.0
>            Reporter: Bill Bell
>            Assignee: Erick Erickson
>            Priority: Minor
>             Fix For: 4.0
>
>         Attachments: NumFacetTermsFacetsTest.java, SOLR-2242-notworkingtest.patch, SOLR-2242.patch, SOLR-2242.patch, SOLR-2242.patch, SOLR-2242.shard.patch, SOLR-2242.shard.patch, SOLR-2242.shard.withtests.patch, SOLR-2242.solr3.1-fix.patch, SOLR-2242.solr3.1.patch, SOLR.2242.solr3.1.patch, SOLR.2242.v2.patch
>
>
> When returning facet.field=<name of field> you will get a list of matches for distinct values. This is normal behavior. This patch tells you how many distinct values you have (# of rows). Use with limit=-1 and mincount=1.
> The feature is called "namedistinct". Here is an example:
> http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=2&facet.limit=-1&facet.field=price
> http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=0&facet.limit=-1&facet.field=price
> http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=1&facet.limit=-1&facet.field=price
> This currently only works on facet.field.
> {code}
> <lst name="facet_fields">
>   <lst name="price">
>     <int name="numFacetTerms">14</int>
>     <int name="0.0">3</int><int name="11.5">1</int><int name="19.95">1</int><int name="74.99">1</int><int name="92.0">1</int><int name="179.99">1</int><int name="185.0">1</int><int name="279.95">1</int><int name="329.95">1</int><int name="350.0">1</int><int name="399.0">1</int><int name="479.95">1</int><int name="649.99">1</int><int name="2199.0">1</int>
>   </lst>
> </lst>
> {code} 
> Several people use this to get the group.field count (the # of groups).



[jira] Updated: (SOLR-2242) Get distinct count of names for a facet field

Posted by "Bill Bell (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bill Bell updated SOLR-2242:
----------------------------

    Comment: was deleted

(was: v2 of the release based on feedback.

Note: SOLR-2242-distinctFacet.patch not needed (left for history))

> Get distinct count of names for a facet field
> ---------------------------------------------
>
>                 Key: SOLR-2242
>                 URL: https://issues.apache.org/jira/browse/SOLR-2242
>             Project: Solr
>          Issue Type: New Feature
>          Components: Response Writers
>    Affects Versions: 4.0
>            Reporter: Bill Bell
>            Priority: Minor
>             Fix For: 4.0
>
>         Attachments: SOLR.2242.v2.patch
>
>
> When returning facet.field=<name of field> you will get a list of matches for distinct values. This is normal behavior. This patch tells you how many distinct values you have (# of rows). Use with limit=-1 and mincount=1.
> The feature is called "namedistinct". Here is an example:
> http://localhost:8983/solr/select?q=*:*&facet=true&facet.field=manu&facet.mincount=1&facet.limit=-1&f.manu.facet.namedistinct=0&facet.field=price&f.price.facet.namedistinct=1
> Here is an example on field "hgid" (without namedistinct):
> {code}
> - <lst name="facet_fields">
> - <lst name="hgid">
>   <int name="HGPY0000045FD36D4000A">1</int> 
>   <int name="HGPY00000FBC6690453A9">1</int> 
>   <int name="HGPY00001E44ED6C4FB3B">1</int> 
>   <int name="HGPY00001FA631034A1B8">1</int> 
>   <int name="HGPY00003317ABAC43B48">1</int> 
>   <int name="HGPY00003A17B2294CB5A">5</int> 
>   <int name="HGPY00003ADD2B3D48C39">1</int> 
>   </lst>
>   </lst>
> {code}
> With namedistinct (HGPY0000045FD36D4000A, HGPY00000FBC6690453A9, HGPY00001E44ED6C4FB3B, HGPY00001FA631034A1B8, HGPY00003317ABAC43B48, HGPY00003A17B2294CB5A, HGPY00003ADD2B3D48C39). This returns number of rows (7), not the number of values (11).
> {code}
> - <lst name="facet_fields">
> - <lst name="hgid">
>   <int name="_count_">7</int> 
>   </lst>
>   </lst>
> {code}
> This works actually really good to get total number of fields for a group.field=hgid. Enjoy!



Re: [jira] [Commented] (SOLR-2242) Get distinct count of names for a facet field

Posted by Bill Bell <bi...@gmail.com>.
Did this work before the patch? This patch only changes facet.field and not ranges. Send the whole URL you are sending to Solr.

Bill Bell
Sent from mobile


On Aug 10, 2011, at 1:55 AM, "Trinh Trung Kien (JIRA)" <ji...@apache.org> wrote:

> 
>    [ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13082216#comment-13082216 ] 
> 
> Trinh Trung Kien commented on SOLR-2242:
> ----------------------------------------
> 
> Hi,
> 
> I applied the patch to Solr 4.0, revision 1140474. The patch seems to work OK, but I observe several issues:
> 
> - I have one field indexed as integer:
> <field name="cell_id" type="integer" indexed="true" stored="true"/>
> 
> When I search for cell_id:[900 TO 1000], there is no result (although I have lots of data with cell_id between 900 and 1000).
> When I then search for cell_id:[1000 TO *], which should return only records with cell_id>=1000, I get all the records back; the range condition seems to have no effect.
> 
> Can you confirm that I'm using the correct version and revision?
> 
> Here is my svn info for the trunk:
> 
> URL: http://svn.apache.org/repos/asf/lucene/dev/trunk
> Repository Root: http://svn.apache.org/repos/asf
> Repository UUID: 13f79535-47bb-0310-9956-ffa450edef68
> Revision: 1140474
> Node Kind: directory
> Schedule: normal
> Last Changed Author: chrism
> Last Changed Rev: 1140408
> Last Changed Date: 2011-06-27 21:52:53 -0500 (Mon, 27 Jun 2011)
> 
> 
> 
> 
> 
> 
>> Get distinct count of names for a facet field
>> ---------------------------------------------
>> 
>>                Key: SOLR-2242
>>                URL: https://issues.apache.org/jira/browse/SOLR-2242
>>            Project: Solr
>>         Issue Type: New Feature
>>         Components: Response Writers
>>   Affects Versions: 4.0
>>           Reporter: Bill Bell
>>           Assignee: Simon Willnauer
>>           Priority: Minor
>>            Fix For: 4.0
>> 
>>        Attachments: NumFacetTermsFacetsTest.java, SOLR-2242-notworkingtest.patch, SOLR-2242.patch, SOLR-2242.patch, SOLR-2242.shard.patch, SOLR-2242.shard.patch, SOLR-2242.shard.withtests.patch, SOLR-2242.solr3.1.patch, SOLR.2242.solr3.1.patch, SOLR.2242.v2.patch
>> 
>> 
>> When returning facet.field=<name of field> you will get a list of matches for distinct values. This is normal behavior. This patch tells you how many distinct values you have (# of rows). Use with limit=-1 and mincount=1.
>> The feature is called "namedistinct". Here is an example:
>> http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=2&facet.limit=-1&facet.field=price
>> http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=0&facet.limit=-1&facet.field=price
>> http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=1&facet.limit=-1&facet.field=price
>> This currently only works on facet.field.
>> {code}
>> <lst name="facet_fields">
>>  <lst name="price">
>>    <int name="numFacetTerms">14</int>
>>    <int name="0.0">3</int><int name="11.5">1</int><int name="19.95">1</int><int name="74.99">1</int><int name="92.0">1</int><int name="179.99">1</int><int name="185.0">1</int><int name="279.95">1</int><int name="329.95">1</int><int name="350.0">1</int><int name="399.0">1</int><int name="479.95">1</int><int name="649.99">1</int><int name="2199.0">1</int>
>>  </lst>
>> </lst>
>> {code} 
>> Several people use this to get the group.field count (the # of groups).
> 


[jira] [Commented] (SOLR-2242) Get distinct count of names for a facet field

Posted by "Trinh Trung Kien (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13082216#comment-13082216 ] 

Trinh Trung Kien commented on SOLR-2242:
----------------------------------------

Hi,

I applied the patch to Solr 4.0, revision 1140474. The patch seems to work OK, but I observe several issues:

- I have one field indexed as integer:
<field name="cell_id" type="integer" indexed="true" stored="true"/>

When I search for cell_id:[900 TO 1000], there is no result (although I have lots of data with cell_id between 900 and 1000).
When I then search for cell_id:[1000 TO *], which should return only records with cell_id>=1000, I get all the records back; the range condition seems to have no effect.

Can you confirm that I'm using the correct version and revision?

Here is my svn info for the trunk:

URL: http://svn.apache.org/repos/asf/lucene/dev/trunk
Repository Root: http://svn.apache.org/repos/asf
Repository UUID: 13f79535-47bb-0310-9956-ffa450edef68
Revision: 1140474
Node Kind: directory
Schedule: normal
Last Changed Author: chrism
Last Changed Rev: 1140408
Last Changed Date: 2011-06-27 21:52:53 -0500 (Mon, 27 Jun 2011)
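
One possible explanation for the range-query symptoms above, offered only as a hypothesis: if the "integer" field type indexes values as plain text, range bounds are compared lexicographically rather than numerically. The Python sketch below reproduces both symptoms under that assumption. The sample cell_id values and the helper names are made up, and nothing here is taken from the patch or from Solr's code.

```python
from typing import List, Optional

# Made-up sample values for cell_id.
CELL_IDS: List[int] = [5, 42, 950, 1000, 1500]


def in_range_as_text(value: int, low: int, high: Optional[int] = None) -> bool:
    """Range check on the decimal strings, i.e. lexicographic ordering."""
    v = str(value)
    return str(low) <= v and (high is None or v <= str(high))


def in_range_as_number(value: int, low: int, high: Optional[int] = None) -> bool:
    """Range check on the numeric values."""
    return low <= value and (high is None or value <= high)


if __name__ == "__main__":
    # [900 TO 1000]: as text, "900" sorts after "1000", so nothing matches.
    print([c for c in CELL_IDS if in_range_as_text(c, 900, 1000)])
    print([c for c in CELL_IDS if in_range_as_number(c, 900, 1000)])
    # [1000 TO *]: as text, every sample value sorts at or after "1000".
    print([c for c in CELL_IDS if in_range_as_text(c, 1000)])
    print([c for c in CELL_IDS if in_range_as_number(c, 1000)])
```

If this is the cause, it would be independent of the patch, which is why checking the same queries on an unpatched build and checking the field type definition in schema.xml are reasonable first steps.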






> Get distinct count of names for a facet field
> ---------------------------------------------
>
>                 Key: SOLR-2242
>                 URL: https://issues.apache.org/jira/browse/SOLR-2242
>             Project: Solr
>          Issue Type: New Feature
>          Components: Response Writers
>    Affects Versions: 4.0
>            Reporter: Bill Bell
>            Assignee: Simon Willnauer
>            Priority: Minor
>             Fix For: 4.0
>
>         Attachments: NumFacetTermsFacetsTest.java, SOLR-2242-notworkingtest.patch, SOLR-2242.patch, SOLR-2242.patch, SOLR-2242.shard.patch, SOLR-2242.shard.patch, SOLR-2242.shard.withtests.patch, SOLR-2242.solr3.1.patch, SOLR.2242.solr3.1.patch, SOLR.2242.v2.patch
>
>
> When returning facet.field=<name of field> you will get a list of matches for distinct values. This is normal behavior. This patch tells you how many distinct values you have (# of rows). Use with limit=-1 and mincount=1.
> The feature is called "namedistinct". Here is an example:
> http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=2&facet.limit=-1&facet.field=price
> http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=0&facet.limit=-1&facet.field=price
> http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=1&facet.limit=-1&facet.field=price
> This currently only works on facet.field.
> {code}
> <lst name="facet_fields">
>   <lst name="price">
>     <int name="numFacetTerms">14</int>
>     <int name="0.0">3</int><int name="11.5">1</int><int name="19.95">1</int><int name="74.99">1</int><int name="92.0">1</int><int name="179.99">1</int><int name="185.0">1</int><int name="279.95">1</int><int name="329.95">1</int><int name="350.0">1</int><int name="399.0">1</int><int name="479.95">1</int><int name="649.99">1</int><int name="2199.0">1</int>
>   </lst>
> </lst>
> {code} 
> Several people use this to get the group.field count (the # of groups).

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] [Assigned] (SOLR-2242) Get distinct count of names for a facet field

Posted by "Erick Erickson (Assigned) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Erick Erickson reassigned SOLR-2242:
------------------------------------

    Assignee: Erick Erickson  (was: Simon Willnauer)
    
> Get distinct count of names for a facet field
> ---------------------------------------------
>
>                 Key: SOLR-2242
>                 URL: https://issues.apache.org/jira/browse/SOLR-2242
>             Project: Solr
>          Issue Type: New Feature
>          Components: Response Writers
>    Affects Versions: 4.0
>            Reporter: Bill Bell
>            Assignee: Erick Erickson
>            Priority: Minor
>             Fix For: 4.0
>
>         Attachments: NumFacetTermsFacetsTest.java, SOLR-2242-notworkingtest.patch, SOLR-2242.patch, SOLR-2242.patch, SOLR-2242.shard.patch, SOLR-2242.shard.patch, SOLR-2242.shard.withtests.patch, SOLR-2242.solr3.1-fix.patch, SOLR-2242.solr3.1.patch, SOLR.2242.solr3.1.patch, SOLR.2242.v2.patch
>
>



[jira] [Commented] (SOLR-2242) Get distinct count of names for a facet field

Posted by "Bill Bell (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13235365#comment-13235365 ] 

Bill Bell commented on SOLR-2242:
---------------------------------

I added sharding as discussed by Antoine.

{code}
<lst name="facet_numTerms">
<lst name="http://localhost:8983/solr">
<int name="price">14</int>
<int name="cat">15</int>
</lst>
<lst name="http://localhost:8081/solr">
<int name="price">23</int>
<int name="cat">3</int>
</lst>
</lst>
{code}

Example call:

http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:8081/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numTerms=true&facet.limit=-1&facet.field=price&facet.field=cat
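
For anyone consuming the per-shard output above, here is a minimal sketch in Python (not part of the patch). It assumes the response fragment has exactly the shape shown in this comment; the function name per_shard_num_terms is hypothetical and chosen only for illustration.

```python
import xml.etree.ElementTree as ET

# Response fragment copied from the comment above.
SAMPLE = """<lst name="facet_numTerms">
<lst name="http://localhost:8983/solr">
<int name="price">14</int>
<int name="cat">15</int>
</lst>
<lst name="http://localhost:8081/solr">
<int name="price">23</int>
<int name="cat">3</int>
</lst>
</lst>"""


def per_shard_num_terms(xml_text):
    """Return {shard_url: {field: distinct_term_count}} (hypothetical helper)."""
    root = ET.fromstring(xml_text)
    # Accept either the facet_numTerms element itself or a document containing it.
    if root.get("name") == "facet_numTerms":
        node = root
    else:
        node = root.find(".//lst[@name='facet_numTerms']")
    if node is None:
        return {}
    return {
        shard.get("name"): {
            entry.get("name"): int(entry.text) for entry in shard.findall("int")
        }
        for shard in node.findall("lst")
    }


if __name__ == "__main__":
    for shard, fields in per_shard_num_terms(SAMPLE).items():
        print(shard, fields)
```

Note that the counts are reported per shard and are not merged, so the client still has to decide how to combine them.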


                
> Get distinct count of names for a facet field
> ---------------------------------------------
>
>                 Key: SOLR-2242
>                 URL: https://issues.apache.org/jira/browse/SOLR-2242
>             Project: Solr
>          Issue Type: New Feature
>          Components: Response Writers
>    Affects Versions: 4.0
>            Reporter: Bill Bell
>            Priority: Minor
>             Fix For: 4.0
>
>         Attachments: NumFacetTermsFacetsTest.java, SOLR-2242-notworkingtest.patch, SOLR-2242-solr40-2.patch, SOLR-2242-solr40.patch, SOLR-2242.patch, SOLR-2242.patch, SOLR-2242.patch, SOLR-2242.shard.patch, SOLR-2242.shard.patch, SOLR-2242.shard.withtests.patch, SOLR-2242.solr3.1-fix.patch, SOLR-2242.solr3.1.patch, SOLR.2242.solr3.1.patch, SOLR.2242.v2.patch
>
>



[jira] [Assigned] (SOLR-2242) Get distinct count of names for a facet field

Posted by "Simon Willnauer (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Simon Willnauer reassigned SOLR-2242:
-------------------------------------

    Assignee: Simon Willnauer

> Get distinct count of names for a facet field
> ---------------------------------------------
>
>                 Key: SOLR-2242
>                 URL: https://issues.apache.org/jira/browse/SOLR-2242
>             Project: Solr
>          Issue Type: New Feature
>          Components: Response Writers
>    Affects Versions: 4.0
>            Reporter: Bill Bell
>            Assignee: Simon Willnauer
>            Priority: Minor
>             Fix For: 4.0
>
>         Attachments: SOLR-2242.patch, SOLR-2242.shard.patch, SOLR-2242.solr3.1.patch, SOLR.2242.solr3.1.patch, SOLR.2242.v2.patch
>
>



[jira] [Updated] (SOLR-2242) Get distinct count of names for a facet field

Posted by "Bill Bell (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bill Bell updated SOLR-2242:
----------------------------

    Attachment: SOLR-2242.solr35.patch

Patch for the Solr 3.5 branch. There is something wrong with branch_3x, but this one commits and is on 3.5.

http://svn.apache.org/repos/asf/lucene/dev/branches/lucene_solr_3_5
Last Changed Rev: 1207561
                
> Get distinct count of names for a facet field
> ---------------------------------------------
>
>                 Key: SOLR-2242
>                 URL: https://issues.apache.org/jira/browse/SOLR-2242
>             Project: Solr
>          Issue Type: New Feature
>          Components: Response Writers
>    Affects Versions: 4.0
>            Reporter: Bill Bell
>            Priority: Minor
>             Fix For: 4.0
>
>         Attachments: NumFacetTermsFacetsTest.java, SOLR-2242-notworkingtest.patch, SOLR-2242-solr40-2.patch, SOLR-2242-solr40-3.patch, SOLR-2242-solr40.patch, SOLR-2242.patch, SOLR-2242.patch, SOLR-2242.patch, SOLR-2242.shard.patch, SOLR-2242.shard.patch, SOLR-2242.shard.withtests.patch, SOLR-2242.solr3.1-fix.patch, SOLR-2242.solr3.1.patch, SOLR-2242.solr35.patch, SOLR.2242.solr3.1.patch, SOLR.2242.v2.patch
>
>



[jira] Commented: (SOLR-2242) Get distinct count of names for a facet field

Posted by "Bill Bell (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13006792#comment-13006792 ] 

Bill Bell commented on SOLR-2242:
---------------------------------

I am going to use your suggestion. You will not have to set the limit or mincount. Getting numFacetTerms will be optional, and you will also be able to omit the hgid facet counts. I propose this (please comment):

This will ONLY output the numFacetTerms (no hgid facet counts):
http://localhost:8983/solr/select?q=*:*&facet=true&facet.field=hgid&f.hgid.facet.numfacetterms=true

This assumes the count is computed with mincount=1 and limit=-1.

{code}
<lst name="facet_fields">
  <lst name="hgid">
   <int name="numFacetTerms">7</int>  <!-- this is not 11 -->
  </lst>
</lst>
{code}

This will output the numFacetTerms AND hgid:
http://localhost:8983/solr/select?q=*:*&facet=true&facet.field=hgid&facet.mincount=1&f.hgid.facet.numfacetterms=both

{code}
<lst name="facet_fields">
  <lst name="hgid">
   <int name="numFacetTerms">7</int>  <!-- this is not 11 -->
   <lst name="hgid">
   	<int name="HGPY0000045FD36D4000A">1</int>
   	<int name="HGPY00000FBC6690453A9">1</int>
   	<int name="HGPY00001E44ED6C4FB3B">1</int>
   	<int name="HGPY00001FA631034A1B8">1</int>
   	<int name="HGPY00003317ABAC43B48">1</int>
   	<int name="HGPY00003A17B2294CB5A">5</int>
   	<int name="HGPY00003ADD2B3D48C39">1</int>
   </lst>
  </lst>
</lst>
{code}
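
To make the "this is not 11" remark concrete, here is a small Python sketch (an illustration only, not code from the patch). It parses the proposed "both" response shown above and compares the number of distinct terms with the sum of the per-term counts. The helper name summarize_field is hypothetical, and the XML layout is assumed to match the proposal exactly.

```python
import xml.etree.ElementTree as ET

# Proposed "both" response from the comment above (XML comment omitted).
PROPOSED = """<lst name="facet_fields">
  <lst name="hgid">
    <int name="numFacetTerms">7</int>
    <lst name="hgid">
      <int name="HGPY0000045FD36D4000A">1</int>
      <int name="HGPY00000FBC6690453A9">1</int>
      <int name="HGPY00001E44ED6C4FB3B">1</int>
      <int name="HGPY00001FA631034A1B8">1</int>
      <int name="HGPY00003317ABAC43B48">1</int>
      <int name="HGPY00003A17B2294CB5A">5</int>
      <int name="HGPY00003ADD2B3D48C39">1</int>
    </lst>
  </lst>
</lst>"""


def summarize_field(xml_text, field):
    """Return (numFacetTerms, {term: count}) for one facet field (hypothetical helper)."""
    root = ET.fromstring(xml_text)
    outer = root.find("lst[@name='%s']" % field)
    num_terms = int(outer.find("int[@name='numFacetTerms']").text)
    # The nested list is absent when only numFacetTerms was requested.
    inner = outer.find("lst[@name='%s']" % field)
    counts = {}
    if inner is not None:
        counts = {e.get("name"): int(e.text) for e in inner.findall("int")}
    return num_terms, counts


if __name__ == "__main__":
    num_terms, counts = summarize_field(PROPOSED, "hgid")
    print("distinct terms:", num_terms)            # number of rows
    print("sum of counts:", sum(counts.values()))  # number of values
```

With the sample above, the distinct-term figure is 7 while the per-term counts add up to 11, which is the distinction the comment is drawing.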

> Get distinct count of names for a facet field
> ---------------------------------------------
>
>                 Key: SOLR-2242
>                 URL: https://issues.apache.org/jira/browse/SOLR-2242
>             Project: Solr
>          Issue Type: New Feature
>          Components: Response Writers
>    Affects Versions: 4.0
>            Reporter: Bill Bell
>            Priority: Minor
>             Fix For: 4.0
>
>         Attachments: SOLR-2242-distinctFacet.patch
>
>
> When returning facet.field=<name of field> you will get a list of matches for distinct values. This is normal behavior. This patch tells you how many distinct values you have (# of rows). Use with limit=-1 and mincount=1.
> The feature is called "namedistinct". Here is an example:
> http://localhost:8983/solr/select?q=*:*&facet=true&facet.field=manu&facet.mincount=1&facet.limit=-1&f.manu.facet.namedistinct=0&facet.field=price&f.price.facet.namedistinct=1
> Here is an example on field "hgid" (without namedistinct):
> {code}
> - <lst name="facet_fields">
> - <lst name="hgid">
>   <int name="HGPY0000045FD36D4000A">1</int> 
>   <int name="HGPY00000FBC6690453A9">1</int> 
>   <int name="HGPY00001E44ED6C4FB3B">1</int> 
>   <int name="HGPY00001FA631034A1B8">1</int> 
>   <int name="HGPY00003317ABAC43B48">1</int> 
>   <int name="HGPY00003A17B2294CB5A">5</int> 
>   <int name="HGPY00003ADD2B3D48C39">1</int> 
>   </lst>
>   </lst>
> {code}
> With namedistinct (HGPY0000045FD36D4000A, HGPY00000FBC6690453A9, HGPY00001E44ED6C4FB3B, HGPY00001FA631034A1B8, HGPY00003317ABAC43B48, HGPY00003A17B2294CB5A, HGPY00003ADD2B3D48C39). This returns number of rows (7), not the number of values (11).
> {code}
> - <lst name="facet_fields">
> - <lst name="hgid">
>   <int name="_count_">7</int> 
>   </lst>
>   </lst>
> {code}
> This actually works really well for getting the total number of groups for group.field=hgid. Enjoy!



[jira] Issue Comment Edited: (SOLR-2242) Get distinct count of names for a facet field

Posted by "Bill Bell (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13006805#comment-13006805 ] 

Bill Bell edited comment on SOLR-2242 at 3/15/11 6:20 AM:
----------------------------------------------------------

OK this is complete.

Sample query:

http://localhost:8983/solr/select?q=*:*&facet=true&facet.field=cat&rows=0&facet.numFacetTerms=2&facet.limit=4

Sample output:
{code}
<?xml version="1.0" encoding="UTF-8" ?> 
<response>
  <lst name="responseHeader">
    <int name="status">0</int> 
    <int name="QTime">0</int> 
    <lst name="params">
      <str name="facet.numfacetterms">2</str> 
      <str name="facet">true</str> 
      <str name="q">*:*</str> 
      <str name="facet.limit">4</str> 
      <str name="facet.field">cat</str> 
      <str name="rows">0</str> 
    </lst>
  </lst>
  <result name="response" numFound="17" start="0" /> 
  <lst name="facet_counts">
    <lst name="facet_queries" /> 
    <lst name="facet_fields">
      <lst name="cat">
        <int name="numFacetTerms">14</int> 
        <lst name="counts">
          <int name="electronics">14</int> 
          <int name="memory">3</int> 
          <int name="connector">2</int> 
          <int name="graphics card">2</int> 
        </lst>
      </lst>
    </lst>
    <lst name="facet_dates" /> 
    <lst name="facet_ranges" /> 
  </lst>
  </response>
{code}

In JSON:

{code}
"facet_fields":{"cat":["numFacetTerms",14,"counts",["electronics",14,"memory",3,"connector",2,"graphics card",2]]},"facet_dates":{},"facet_ranges":{}}}
{code}
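
A client-side sketch in Python for reading the JSON form above (illustrative only). It assumes the flat name/value array layout shown in the sample, where names and values alternate; the helper names pairs and parse_field are hypothetical.

```python
import json

# The facet_fields portion of the JSON sample above, wrapped as a complete object.
SAMPLE = (
    '{"facet_fields":{"cat":["numFacetTerms",14,"counts",'
    '["electronics",14,"memory",3,"connector",2,"graphics card",2]]}}'
)


def pairs(flat):
    """Turn [name1, value1, name2, value2, ...] into a dict."""
    return dict(zip(flat[0::2], flat[1::2]))


def parse_field(flat_field):
    """Return (numFacetTerms, {term: count}) for one facet field entry."""
    entry = pairs(flat_field)
    return entry["numFacetTerms"], pairs(entry["counts"])


if __name__ == "__main__":
    data = json.loads(SAMPLE)
    num_terms, counts = parse_field(data["facet_fields"]["cat"])
    print(num_terms, counts)
```

In the sample, numFacetTerms is 14 even though only four terms are listed, because facet.limit=4 truncates the counts list but not the distinct-term total.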

      was (Author: billnbell):
    OK this is complete.

Sample query:

http://localhost:8983/solr/select?q=*:*&facet=true&facet.field=cat&rows=0&facet.numfacetterms=2&facet.limit=4

Sample output:
{code}
<?xml version="1.0" encoding="UTF-8" ?> 
<response>
  <lst name="responseHeader">
    <int name="status">0</int> 
    <int name="QTime">0</int> 
    <lst name="params">
      <str name="facet.numfacetterms">2</str> 
      <str name="facet">true</str> 
      <str name="q">*:*</str> 
      <str name="facet.limit">4</str> 
      <str name="facet.field">cat</str> 
      <str name="rows">0</str> 
    </lst>
  </lst>
  <result name="response" numFound="17" start="0" /> 
  <lst name="facet_counts">
    <lst name="facet_queries" /> 
    <lst name="facet_fields">
      <lst name="cat">
        <int name="numFacetTerms">14</int> 
        <lst name="counts">
          <int name="electronics">14</int> 
          <int name="memory">3</int> 
          <int name="connector">2</int> 
          <int name="graphics card">2</int> 
        </lst>
      </lst>
    </lst>
    <lst name="facet_dates" /> 
    <lst name="facet_ranges" /> 
  </lst>
  </response>
{code}

In Json:

{code}
"facet_fields":{"cat":["numFacetTerms",14,"counts",["electronics",14,"memory",3,"connector",2,"graphics card",2]]},"facet_dates":{},"facet_ranges":{}}}
{code}
  
> Get distinct count of names for a facet field
> ---------------------------------------------
>
>                 Key: SOLR-2242
>                 URL: https://issues.apache.org/jira/browse/SOLR-2242
>             Project: Solr
>          Issue Type: New Feature
>          Components: Response Writers
>    Affects Versions: 4.0
>            Reporter: Bill Bell
>            Priority: Minor
>             Fix For: 4.0
>
>         Attachments: SOLR-2242-distinctFacet.patch, SOLR-2242.v2.patch
>
>



[jira] [Updated] (SOLR-2242) Get distinct count of names for a facet field

Posted by "Bill Bell (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bill Bell updated SOLR-2242:
----------------------------

    Attachment: SOLR-2242-3x_3.patch

Latest 3x patch: SOLR-2242-3x_3.patch
                
> Get distinct count of names for a facet field
> ---------------------------------------------
>
>                 Key: SOLR-2242
>                 URL: https://issues.apache.org/jira/browse/SOLR-2242
>             Project: Solr
>          Issue Type: New Feature
>          Components: Response Writers
>    Affects Versions: 4.0
>            Reporter: Bill Bell
>            Priority: Minor
>             Fix For: 4.0
>
>         Attachments: SOLR-2242-3x.patch, SOLR-2242-3x_3.patch, SOLR-2242-solr40-3.patch, SOLR-2242.patch, SOLR-2242.patch, SOLR-2242.patch, SOLR-2242.shard.withtests.patch, SOLR-2242.solr3.1-fix.patch, SOLR-2242.solr3.1.patch, SOLR.2242.solr3.1.patch
>
>



[jira] Updated: (SOLR-2242) Get distinct count of names for a facet field

Posted by "Bill Bell (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bill Bell updated SOLR-2242:
----------------------------

    Attachment: SOLR-2242-distinctFacet.patch

> Get distinct count of names for a facet field
> ---------------------------------------------
>
>                 Key: SOLR-2242
>                 URL: https://issues.apache.org/jira/browse/SOLR-2242
>             Project: Solr
>          Issue Type: New Feature
>          Components: Response Writers
>    Affects Versions: 4.0
>            Reporter: Bill Bell
>            Priority: Minor
>             Fix For: 4.0
>
>         Attachments: SOLR-2242-distinctFacet.patch
>
>



[jira] [Issue Comment Edited] (SOLR-2242) Get distinct count of names for a facet field

Posted by "Chris Male (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13071156#comment-13071156 ] 

Chris Male edited comment on SOLR-2242 at 7/26/11 3:28 PM:
-----------------------------------------------------------

{quote}
That seems reasonable – though I think we would also want to be able to have the sum when you know that all shards have unique values.
{quote}

Perhaps we should return the maximum and sum of all shard counts?  That way, assuming the client knew how many shards exist, they could handle most scenarios.

{quote}
I don't think bill is referring to the accuracy/meaning of distinct count in distributed search. His problem is that if we change the output format, we also need to update the code that collects the various values and passes them along. This patch just add a magic value (numFacetTerms) to the count list so that the value is handled with existing distributed response parsing. This is a fine one-off solution, but I am -1 for adding any more magic field names to solr. To add this feature, i think we need to bite the bullet and update the facet response format.
{quote}

Absolutely.  I hadn't even considered the prospect of not changing the distributed response parsing.
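
The maximum/sum idea can be sketched as follows (Python, illustrative only, not code from any attached patch). The true distinct count across shards can be no smaller than the largest per-shard count (the case where every other shard's terms already appear on that shard) and no larger than the sum (the case where no term is shared between shards). The function name distinct_count_bounds and the input numbers are assumptions made for the example.

```python
def distinct_count_bounds(shard_counts):
    """Return (lower, upper) bounds on the distinct term count across shards.

    lower = max of the per-shard counts (shards overlap completely)
    upper = sum of the per-shard counts (shards share no terms)
    """
    counts = list(shard_counts)
    if not counts:
        return (0, 0)
    return (max(counts), sum(counts))


if __name__ == "__main__":
    # Hypothetical per-shard distinct counts for one field on two shards.
    print(distinct_count_bounds([14, 23]))
```

The exact value cannot be recovered from the per-shard counts alone; the two bounds only coincide when a single shard holds all the terms.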

      was (Author: cmale):
    {code}
That seems reasonable – though I think we would also want to be able to have the sum when you know that all shards have unique values.
{code}

Perhaps we should return the maximum and sum of all shard counts?  That way, assuming the client knew how many shards exist, they could handle most scenarios.

{code}
I don't think bill is referring to the accuracy/meaning of distinct count in distributed search. His problem is that if we change the output format, we also need to update the code that collects the various values and passes them along. This patch just add a magic value (numFacetTerms) to the count list so that the value is handled with existing distributed response parsing. This is a fine one-off solution, but I am -1 for adding any more magic field names to solr. To add this feature, i think we need to bite the bullet and update the facet response format.
{code}

Absolutely.  I hadn't even considered the prospect of not changing the distributed response parsing.
  
> Get distinct count of names for a facet field
> ---------------------------------------------
>
>                 Key: SOLR-2242
>                 URL: https://issues.apache.org/jira/browse/SOLR-2242
>             Project: Solr
>          Issue Type: New Feature
>          Components: Response Writers
>    Affects Versions: 4.0
>            Reporter: Bill Bell
>            Assignee: Simon Willnauer
>            Priority: Minor
>             Fix For: 4.0
>
>         Attachments: NumFacetTermsFacetsTest.java, SOLR-2242-notworkingtest.patch, SOLR-2242.patch, SOLR-2242.patch, SOLR-2242.shard.patch, SOLR-2242.shard.patch, SOLR-2242.shard.withtests.patch, SOLR-2242.solr3.1.patch, SOLR.2242.solr3.1.patch, SOLR.2242.v2.patch
>
>

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

       

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] [Commented] (SOLR-2242) Get distinct count of names for a facet field

Posted by "Antoine Le Floc'h (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13174739#comment-13174739 ] 

Antoine Le Floc'h commented on SOLR-2242:
-----------------------------------------

I am using this patch and may want to add extra information to the facet results, and I want to use sharding... Is there an associated patch to fix sharding? Is it an easy fix? Does this work out of the box in 4.0? Thank you.
                
> Get distinct count of names for a facet field
> ---------------------------------------------
>
>                 Key: SOLR-2242
>                 URL: https://issues.apache.org/jira/browse/SOLR-2242
>             Project: Solr
>          Issue Type: New Feature
>          Components: Response Writers
>    Affects Versions: 4.0
>            Reporter: Bill Bell
>            Assignee: Erick Erickson
>            Priority: Minor
>             Fix For: 4.0
>
>         Attachments: NumFacetTermsFacetsTest.java, SOLR-2242-notworkingtest.patch, SOLR-2242.patch, SOLR-2242.patch, SOLR-2242.patch, SOLR-2242.shard.patch, SOLR-2242.shard.patch, SOLR-2242.shard.withtests.patch, SOLR-2242.solr3.1-fix.patch, SOLR-2242.solr3.1.patch, SOLR.2242.solr3.1.patch, SOLR.2242.v2.patch


[jira] [Commented] (SOLR-2242) Get distinct count of names for a facet field

Posted by "Chris Male (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13068892#comment-13068892 ] 

Chris Male commented on SOLR-2242:
----------------------------------

I'm just jumping into this issue and considering the problem of loading all constraints just to get their size (or in fact, not wanting to do this).  Is there scope in SimpleFacets to add some sort of 'Collector' idea? That way it would be easy to choose whether we want to collect the constraints, their counts and the total number of constraints, or just the total number.
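
A rough sketch of what such a 'Collector' could look like is given here. Everything in it is hypothetical: FacetCollector, CountsCollector, TotalOnlyCollector and the facet() loop are illustration names, not existing Solr classes, and the faceting loop is reduced to iterating over a plain map of term counts.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class FacetCollectorSketch {

    /** Hypothetical callback; not an existing Solr interface. */
    interface FacetCollector {
        void collect(String term, int count);
    }

    /** Keeps every constraint and its count, plus the total number of constraints. */
    static final class CountsCollector implements FacetCollector {
        final Map<String, Integer> counts = new LinkedHashMap<>();

        @Override
        public void collect(String term, int count) {
            counts.put(term, count);
        }

        int total() {
            return counts.size();
        }
    }

    /** Keeps only a running total, so no constraint is held in memory. */
    static final class TotalOnlyCollector implements FacetCollector {
        private int total;

        @Override
        public void collect(String term, int count) {
            total++;
        }

        int total() {
            return total;
        }
    }

    /** Stand-in for the faceting loop: passes each term that meets minCount to the collector. */
    static void facet(Map<String, Integer> termCounts, int minCount, FacetCollector collector) {
        for (Map.Entry<String, Integer> e : termCounts.entrySet()) {
            if (e.getValue() >= minCount) {
                collector.collect(e.getKey(), e.getValue());
            }
        }
    }

    public static void main(String[] args) {
        Map<String, Integer> termCounts = new LinkedHashMap<>();
        termCounts.put("red", 4);
        termCounts.put("blue", 3);
        termCounts.put("green", 0);

        CountsCollector full = new CountsCollector();
        facet(termCounts, 1, full);
        System.out.println(full.counts + " total=" + full.total()); // {red=4, blue=3} total=2

        TotalOnlyCollector totalOnly = new TotalOnlyCollector();
        facet(termCounts, 1, totalOnly);
        System.out.println("total=" + totalOnly.total()); // total=2
    }
}
```

The point of the split is that the caller picks the collector, so the "just give me the number" case never builds the constraint list.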

Does anybody have any thoughts on the distribution issue?



[jira] [Commented] (SOLR-2242) Get distinct count of names for a facet field

Posted by "Bill Bell (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13072732#comment-13072732 ] 

Bill Bell commented on SOLR-2242:
---------------------------------

To make this work right with distribution, it seems that it might be more complicated... Wouldn't you have to send the full list of facet terms, consolidate them, and then loop to get the distinct number? That is why I originally sent the WHOLE list of facets, and just added the magic number to the end.

One machine:

male: 10000
numFacetTerms: 1

Another machine:

female: 7000
male: 500
numFacetTerms: 2

The numFacetTerms that we want is 2, since combining the two lists and looping over the result gives 2:

male: 10500
female: 7000
numFacetTerms: 2

If we simply add the per-machine numFacetTerms values, we get 1 + 2 = 3.

The other 2 are easier:

distribMaxTerms: 2
distribSumTerms: 3

This is not ideal but may be acceptable. The perfect solution is to send the whole list, dedupe it, and then count. Thoughts?
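
The male/female example above can be worked through in a few lines of Java. This is only a sketch under simplifying assumptions: each shard response is modelled as a plain map of term to count, and the class and method names (ShardFacetMerge, merge, distinctTerms, sumTerms, maxTerms) are made up for illustration rather than taken from any patch. It shows that the exact distinct count needs the merged term list, while the max and sum of the per-shard counts only bound it from below and above.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class ShardFacetMerge {

    /** Sums per-term counts across shards; the key set is the deduplicated term list. */
    static Map<String, Integer> merge(List<Map<String, Integer>> shards) {
        Map<String, Integer> merged = new LinkedHashMap<>();
        for (Map<String, Integer> shard : shards) {
            for (Map.Entry<String, Integer> e : shard.entrySet()) {
                merged.merge(e.getKey(), e.getValue(), Integer::sum);
            }
        }
        return merged;
    }

    /** Exact distinct count: requires the full term list from every shard. */
    static int distinctTerms(List<Map<String, Integer>> shards) {
        return merge(shards).size();
    }

    /** Upper bound: a term present on several shards is counted once per shard. */
    static int sumTerms(List<Map<String, Integer>> shards) {
        int sum = 0;
        for (Map<String, Integer> shard : shards) {
            sum += shard.size();
        }
        return sum;
    }

    /** Lower bound: the largest per-shard term count. */
    static int maxTerms(List<Map<String, Integer>> shards) {
        int max = 0;
        for (Map<String, Integer> shard : shards) {
            max = Math.max(max, shard.size());
        }
        return max;
    }

    public static void main(String[] args) {
        Map<String, Integer> shard1 = new LinkedHashMap<>();
        shard1.put("male", 10000);
        Map<String, Integer> shard2 = new LinkedHashMap<>();
        shard2.put("female", 7000);
        shard2.put("male", 500);
        List<Map<String, Integer>> shards = Arrays.asList(shard1, shard2);

        System.out.println(merge(shards));         // {male=10500, female=7000}
        System.out.println(distinctTerms(shards)); // 2
        System.out.println(sumTerms(shards));      // 3
        System.out.println(maxTerms(shards));      // 2
    }
}
```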





[jira] [Commented] (SOLR-2242) Get distinct count of names for a facet field

Posted by "Yonik Seeley (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13174184#comment-13174184 ] 

Yonik Seeley commented on SOLR-2242:
------------------------------------

bq. I'm also slightly anti the min/max idea. I'm not sure what value there is in telling someone "there are between 10,000 and 90,000 distinct values".

I think we could come up with a pretty good estimate (but we should tell them it's an estimate somehow).  Anyway, that could optionally be handled in a different issue.

bq. 2> back compat. Cody's suggestion seems to be the slickest in terms of not breaking things, but we use attributes in just a few places, are there reasons NOT to do it that way? Or does this mess up JSON, PHP, etc?

Yes, it messes up JSON, binary format, etc.  We'd need to figure out how to add attributes into our data model (that gets sent to response writers) in a generic way.

bq. 3> Possibly add a new JIRA for changing the facet response format to be tolerant of sub-fields, but don't do that here.

Not sure how that's possible... it's either more magic field names in with the individual constraints, or the facet response format has got to change.
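
To make the trade-off concrete, here is a small sketch of the two options from a client's point of view. It is an illustration only: the flat response is modelled as a plain Java map (an assumption, not how Solr represents it internally), and NestedFacet is a hypothetical shape for a changed response format, not a proposal from this issue.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class MagicFieldNameSketch {

    static final String MAGIC = "numFacetTerms";

    /** Flat format: the client has to know the magic key and strip it from the constraints. */
    static Map<String, Integer> constraintsFromFlat(Map<String, Integer> flat) {
        Map<String, Integer> constraints = new LinkedHashMap<>(flat);
        // Also drops a genuine indexed term that happens to be called "numFacetTerms".
        constraints.remove(MAGIC);
        return constraints;
    }

    /** Nested format (hypothetical): the total sits beside the constraints, not among them. */
    static final class NestedFacet {
        final int numTerms;
        final Map<String, Integer> counts;

        NestedFacet(int numTerms, Map<String, Integer> counts) {
            this.numTerms = numTerms;
            this.counts = counts;
        }
    }

    public static void main(String[] args) {
        Map<String, Integer> flat = new LinkedHashMap<>();
        flat.put(MAGIC, 3);
        flat.put("0.0", 3);
        flat.put("11.5", 1);
        flat.put("19.95", 1);
        System.out.println(constraintsFromFlat(flat)); // {0.0=3, 11.5=1, 19.95=1}

        Map<String, Integer> counts = new LinkedHashMap<>();
        counts.put("numFacetTerms", 5); // a genuine term with the same name is unambiguous here
        counts.put("other", 2);
        NestedFacet nested = new NestedFacet(counts.size(), counts);
        System.out.println(nested.numTerms + " " + nested.counts); // 2 {numFacetTerms=5, other=2}
    }
}
```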

                
> Get distinct count of names for a facet field
> ---------------------------------------------
>
>                 Key: SOLR-2242
>                 URL: https://issues.apache.org/jira/browse/SOLR-2242
>             Project: Solr
>          Issue Type: New Feature
>          Components: Response Writers
>    Affects Versions: 4.0
>            Reporter: Bill Bell
>            Assignee: Erick Erickson
>            Priority: Minor
>             Fix For: 4.0
>
>         Attachments: NumFacetTermsFacetsTest.java, SOLR-2242-notworkingtest.patch, SOLR-2242.patch, SOLR-2242.patch, SOLR-2242.shard.patch, SOLR-2242.shard.patch, SOLR-2242.shard.withtests.patch, SOLR-2242.solr3.1-fix.patch, SOLR-2242.solr3.1.patch, SOLR.2242.solr3.1.patch, SOLR.2242.v2.patch


[jira] [Issue Comment Edited] (SOLR-2242) Get distinct count of names for a facet field

Posted by "Nguyen Kien Trung (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13095135#comment-13095135 ] 

Nguyen Kien Trung edited comment on SOLR-2242 at 9/1/11 5:50 AM:
-----------------------------------------------------------------

I'm using Solr 3.2. Instead of patching, I extended {{SimpleFacets}} and {{FacetComponent}} and applied the changes from [^SOLR-2242.solr3.1.patch] with a small fix ([^SOLR-2242.solr3.1-fix.patch]):
{code}
int offset = params.getFieldInt(facetValue, FacetParams.FACET_OFFSET, 0);
....
resCount.add("numTerms", counts.size() + offset);
{code}

since {{counts}} contains the list of terms starting from the given {{offset}}.

It accepts the parameter {{facet.numTerms=true|false}} and produces the following output:
{code}
<lst name="facet_fields">
   <lst name="color">
      <int name="numTerms">124</int>
      <lst name="counts">
          <int name="red">4</int>
          <int name="blue">3</int>
      </lst>
   </lst>
</lst>
{code}
Not yet tested with sharding
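
The reasoning behind the {{counts.size() + offset}} fix can be illustrated with plain lists. This is a sketch of the arithmetic only, not the patched Solr code: NumTermsOffsetSketch, page() and numTerms() are made-up names, and the page is assumed to be unlimited (as with facet.limit=-1), so it holds every term from the offset onward.

```java
import java.util.Arrays;
import java.util.List;

final class NumTermsOffsetSketch {

    /** Mimics an unlimited facet page: every term from the offset onward. */
    static List<String> page(List<String> allTerms, int offset) {
        int from = Math.min(offset, allTerms.size());
        return allTerms.subList(from, allTerms.size());
    }

    /** Page size plus offset recovers the total as long as offset <= total. */
    static int numTerms(List<String> page, int offset) {
        return page.size() + offset;
    }

    public static void main(String[] args) {
        List<String> all = Arrays.asList("a", "b", "c", "d", "e");

        List<String> fromTwo = page(all, 2);
        System.out.println(fromTwo);              // [c, d, e]
        System.out.println(fromTwo.size());       // 3, which undercounts without the fix
        System.out.println(numTerms(fromTwo, 2)); // 5
    }
}
```

In this sketch an offset larger than the number of terms would overstate the total; whether the patched code guards against that case is not stated in the comment above.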

  
> Get distinct count of names for a facet field
> ---------------------------------------------
>
>                 Key: SOLR-2242
>                 URL: https://issues.apache.org/jira/browse/SOLR-2242
>             Project: Solr
>          Issue Type: New Feature
>          Components: Response Writers
>    Affects Versions: 4.0
>            Reporter: Bill Bell
>            Assignee: Simon Willnauer
>            Priority: Minor
>             Fix For: 4.0
>
>         Attachments: NumFacetTermsFacetsTest.java, SOLR-2242-notworkingtest.patch, SOLR-2242.patch, SOLR-2242.patch, SOLR-2242.shard.patch, SOLR-2242.shard.patch, SOLR-2242.shard.withtests.patch, SOLR-2242.solr3.1-fix.patch, SOLR-2242.solr3.1.patch, SOLR.2242.solr3.1.patch, SOLR.2242.v2.patch


[jira] [Commented] (SOLR-2242) Get distinct count of names for a facet field

Posted by "Bill Bell (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13048967#comment-13048967 ] 

Bill Bell commented on SOLR-2242:
---------------------------------

Lance,

This patch just takes the # of lines coming out of the facet section for a field and tells you how many you have.

It does not do anything to change the facet, or deal with white space, or anything complicated.

This is a simple counter.

Bill


> Get distinct count of names for a facet field
> ---------------------------------------------
>
>                 Key: SOLR-2242
>                 URL: https://issues.apache.org/jira/browse/SOLR-2242
>             Project: Solr
>          Issue Type: New Feature
>          Components: Response Writers
>    Affects Versions: 4.0
>            Reporter: Bill Bell
>            Priority: Minor
>             Fix For: 4.0
>
>         Attachments: SOLR-2242.patch, SOLR-2242.shard.patch, SOLR-2242.solr3.1.patch, SOLR.2242.solr3.1.patch, SOLR.2242.v2.patch


[jira] [Updated] (SOLR-2242) Get distinct count of names for a facet field

Posted by "Bill Bell (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bill Bell updated SOLR-2242:
----------------------------

    Attachment: SOLR-2242-3x_4.patch

Fixed one of the tests that was failing.
SOLR-2242-3x_4.patch

                
> Get distinct count of names for a facet field
> ---------------------------------------------
>
>                 Key: SOLR-2242
>                 URL: https://issues.apache.org/jira/browse/SOLR-2242
>             Project: Solr
>          Issue Type: New Feature
>          Components: Response Writers
>    Affects Versions: 4.0
>            Reporter: Bill Bell
>            Priority: Minor
>             Fix For: 4.0
>
>         Attachments: SOLR-2242-3x.patch, SOLR-2242-3x_4.patch, SOLR-2242-solr40-3.patch, SOLR-2242.patch, SOLR-2242.patch, SOLR-2242.patch, SOLR-2242.shard.withtests.patch, SOLR-2242.solr3.1-fix.patch, SOLR-2242.solr3.1.patch, SOLR.2242.solr3.1.patch


[jira] [Commented] (SOLR-2242) Get distinct count of names for a facet field

Posted by "Bill Bell (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13055158#comment-13055158 ] 

Bill Bell commented on SOLR-2242:
---------------------------------

{code}
junit-sequential:
    [junit] Testsuite: org.apache.solr.request.NumFacetTermsFacetsTest
    [junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 3.48 sec
    [junit] 
{code}

I fixed the NamedList() generic too.




> Get distinct count of names for a facet field
> ---------------------------------------------
>
>                 Key: SOLR-2242
>                 URL: https://issues.apache.org/jira/browse/SOLR-2242
>             Project: Solr
>          Issue Type: New Feature
>          Components: Response Writers
>    Affects Versions: 4.0
>            Reporter: Bill Bell
>            Assignee: Simon Willnauer
>            Priority: Minor
>             Fix For: 4.0
>
>         Attachments: SOLR-2242-notworkingtest.patch, SOLR-2242.patch, SOLR-2242.shard.patch, SOLR-2242.shard.patch, SOLR-2242.shard.withtests.patch, SOLR-2242.solr3.1.patch, SOLR.2242.solr3.1.patch, SOLR.2242.v2.patch


[jira] [Commented] (SOLR-2242) Get distinct count of names for a facet field

Posted by "bronco (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13120635#comment-13120635 ] 

bronco commented on SOLR-2242:
------------------------------

Will there also be a solution for 3.5 to get the correct numFound results?
                


[jira] Commented: (SOLR-2242) Get distinct count of names for a facet field

Posted by "Bill Bell (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13006809#comment-13006809 ] 

Bill Bell commented on SOLR-2242:
---------------------------------

I am changing it, since there is already one existing parameter that mixes upper and lower case:

facet.enum.cache.minDf



> Get distinct count of names for a facet field
> ---------------------------------------------
>
>                 Key: SOLR-2242
>                 URL: https://issues.apache.org/jira/browse/SOLR-2242
>             Project: Solr
>          Issue Type: New Feature
>          Components: Response Writers
>    Affects Versions: 4.0
>            Reporter: Bill Bell
>            Priority: Minor
>             Fix For: 4.0
>
>         Attachments: SOLR-2242-distinctFacet.patch, SOLR-2242.v2.patch
>
>
> When returning facet.field=<name of field> you will get a list of matches for distinct values. This is normal behavior. This patch tells you how many distinct values you have (# of rows). Use with limit=-1 and mincount=1.
> The feature is called "namedistinct". Here is an example:
> http://localhost:8983/solr/select?q=*:*&facet=true&facet.field=manu&facet.mincount=1&facet.limit=-1&f.manu.facet.namedistinct=0&facet.field=price&f.price.facet.namedistinct=1
> Here is an example on field "hgid" (without namedistinct):
> {code}
> - <lst name="facet_fields">
> - <lst name="hgid">
>   <int name="HGPY0000045FD36D4000A">1</int> 
>   <int name="HGPY00000FBC6690453A9">1</int> 
>   <int name="HGPY00001E44ED6C4FB3B">1</int> 
>   <int name="HGPY00001FA631034A1B8">1</int> 
>   <int name="HGPY00003317ABAC43B48">1</int> 
>   <int name="HGPY00003A17B2294CB5A">5</int> 
>   <int name="HGPY00003ADD2B3D48C39">1</int> 
>   </lst>
>   </lst>
> {code}
> With namedistinct (HGPY0000045FD36D4000A, HGPY00000FBC6690453A9, HGPY00001E44ED6C4FB3B, HGPY00001FA631034A1B8, HGPY00003317ABAC43B48, HGPY00003A17B2294CB5A, HGPY00003ADD2B3D48C39). This returns number of rows (7), not the number of values (11).
> {code}
> - <lst name="facet_fields">
> - <lst name="hgid">
>   <int name="_count_">7</int> 
>   </lst>
>   </lst>
> {code}
> This works actually really good to get total number of fields for a group.field=hgid. Enjoy!



[jira] Updated: (SOLR-2242) Get distinct count of names for a facet field

Posted by "Bill Bell (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bill Bell updated SOLR-2242:
----------------------------

    Comment: was deleted

(was: Maybe, but I thought all params were supposed to be lower case?

I can easily change that?)

> Get distinct count of names for a facet field
> ---------------------------------------------
>
>                 Key: SOLR-2242
>                 URL: https://issues.apache.org/jira/browse/SOLR-2242
>             Project: Solr
>          Issue Type: New Feature
>          Components: Response Writers
>    Affects Versions: 4.0
>            Reporter: Bill Bell
>            Priority: Minor
>             Fix For: 4.0
>
>         Attachments: SOLR.2242.v2.patch
>
>
> When returning facet.field=<name of field> you will get a list of matches for distinct values. This is normal behavior. This patch tells you how many distinct values you have (# of rows). Use with limit=-1 and mincount=1.
> The feature is called "namedistinct". Here is an example:
> http://localhost:8983/solr/select?q=*:*&facet=true&facet.field=manu&facet.mincount=1&facet.limit=-1&f.manu.facet.namedistinct=0&facet.field=price&f.price.facet.namedistinct=1
> Here is an example on field "hgid" (without namedistinct):
> {code}
> - <lst name="facet_fields">
> - <lst name="hgid">
>   <int name="HGPY0000045FD36D4000A">1</int> 
>   <int name="HGPY00000FBC6690453A9">1</int> 
>   <int name="HGPY00001E44ED6C4FB3B">1</int> 
>   <int name="HGPY00001FA631034A1B8">1</int> 
>   <int name="HGPY00003317ABAC43B48">1</int> 
>   <int name="HGPY00003A17B2294CB5A">5</int> 
>   <int name="HGPY00003ADD2B3D48C39">1</int> 
>   </lst>
>   </lst>
> {code}
> With namedistinct (HGPY0000045FD36D4000A, HGPY00000FBC6690453A9, HGPY00001E44ED6C4FB3B, HGPY00001FA631034A1B8, HGPY00003317ABAC43B48, HGPY00003A17B2294CB5A, HGPY00003ADD2B3D48C39). This returns number of rows (7), not the number of values (11).
> {code}
> - <lst name="facet_fields">
> - <lst name="hgid">
>   <int name="_count_">7</int> 
>   </lst>
>   </lst>
> {code}
> This works actually really good to get total number of fields for a group.field=hgid. Enjoy!



[jira] [Commented] (SOLR-2242) Get distinct count of names for a facet field

Posted by "Bill Bell (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13061785#comment-13061785 ] 

Bill Bell commented on SOLR-2242:
---------------------------------

Are we ready to commit?

> Get distinct count of names for a facet field
> ---------------------------------------------
>
>                 Key: SOLR-2242
>                 URL: https://issues.apache.org/jira/browse/SOLR-2242
>             Project: Solr
>          Issue Type: New Feature
>          Components: Response Writers
>    Affects Versions: 4.0
>            Reporter: Bill Bell
>            Assignee: Simon Willnauer
>            Priority: Minor
>             Fix For: 4.0
>
>         Attachments: SOLR-2242-notworkingtest.patch, SOLR-2242.patch, SOLR-2242.patch, SOLR-2242.shard.patch, SOLR-2242.shard.patch, SOLR-2242.shard.withtests.patch, SOLR-2242.solr3.1.patch, SOLR.2242.solr3.1.patch, SOLR.2242.v2.patch
>
>
> When returning facet.field=<name of field> you will get a list of matches for distinct values. This is normal behavior. This patch tells you how many distinct values you have (# of rows). Use with limit=-1 and mincount=1.
> The feature is called "namedistinct". Here is an example:
> http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=2&facet.limit=-1&facet.field=price
> http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=0&facet.limit=-1&facet.field=price
> http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=1&facet.limit=-1&facet.field=price
> This currently only works on facet.field.
> {code}
> <lst name="facet_fields">
>   <lst name="price">
>     <int name="numFacetTerms">14</int>
>     <int name="0.0">3</int><int name="11.5">1</int><int name="19.95">1</int><int name="74.99">1</int><int name="92.0">1</int><int name="179.99">1</int><int name="185.0">1</int><int name="279.95">1</int><int name="329.95">1</int><int name="350.0">1</int><int name="399.0">1</int><int name="479.95">1</int><int name="649.99">1</int><int name="2199.0">1</int>
>   </lst>
> </lst>
> {code} 
> Several people use this to get the group.field count (the # of groups).



[jira] Issue Comment Edited: (SOLR-2242) Get distinct count of names for a facet field

Posted by "Bill Bell (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13006779#comment-13006779 ] 

Bill Bell edited comment on SOLR-2242 at 3/15/11 6:06 AM:
----------------------------------------------------------

No, actually namedistinct is not the number of values; it is the number of names.

{code}
<lst name="facet_fields">
  <lst name="hgid">
    <int name="HGPY0000045FD36D4000A">1</int>
    <int name="HGPY00000FBC6690453A9">1</int>
    <int name="HGPY00001E44ED6C4FB3B">1</int>
    <int name="HGPY00001FA631034A1B8">1</int>
    <int name="HGPY00003317ABAC43B48">1</int>
    <int name="HGPY00003A17B2294CB5A">5</int>
    <int name="HGPY00003ADD2B3D48C39">1</int>
  </lst>
</lst>
{code}

Becomes:

{code}
<lst name="facet_fields">
  <lst name="hgid">
    <int name="namedistinct">7</int>  <!-- this is not 11 -->
    <lst name="counts">
      <int name="HGPY0000045FD36D4000A">1</int>
      <int name="HGPY00000FBC6690453A9">1</int>
      <int name="HGPY00001E44ED6C4FB3B">1</int>
      <int name="HGPY00001FA631034A1B8">1</int>
      <int name="HGPY00003317ABAC43B48">1</int>
      <int name="HGPY00003A17B2294CB5A">5</int>
      <int name="HGPY00003ADD2B3D48C39">1</int>
    </lst>
  </lst>
</lst>
{code}
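
To make the names-versus-values distinction concrete, here is a minimal Python sketch. It is not part of any attached patch, and the helper name `name_and_value_totals` is made up for illustration. It parses the plain facet output for "hgid" shown above and computes both numbers on the client side.

```python
import xml.etree.ElementTree as ET

# Plain facet output for "hgid", copied from the example above.
FACET_XML = """
<lst name="facet_fields">
  <lst name="hgid">
    <int name="HGPY0000045FD36D4000A">1</int>
    <int name="HGPY00000FBC6690453A9">1</int>
    <int name="HGPY00001E44ED6C4FB3B">1</int>
    <int name="HGPY00001FA631034A1B8">1</int>
    <int name="HGPY00003317ABAC43B48">1</int>
    <int name="HGPY00003A17B2294CB5A">5</int>
    <int name="HGPY00003ADD2B3D48C39">1</int>
  </lst>
</lst>
"""


def name_and_value_totals(xml_text, field):
    """Return (number of distinct names, sum of their counts) for one facet field.

    Hypothetical helper for illustration only; it is not a Solr API.
    """
    root = ET.fromstring(xml_text.strip())
    counts = [
        int(node.text)
        for lst in root.findall("lst")
        if lst.get("name") == field
        for node in lst.findall("int")
    ]
    return len(counts), sum(counts)


names, values = name_and_value_totals(FACET_XML, "hgid")
print(names, values)
```

The first number (7) is what namedistinct reports; the second (11) is the sum of the per-name counts, which is the figure it should not be confused with.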


      was (Author: billnbell):
    No actually namedistinct is not the number of values. It is the number of names.

{code}
<lst name="facet_fields">
  <lst name="hgid">
    <int name="HGPY0000045FD36D4000A">1</int>
    <int name="HGPY00000FBC6690453A9">1</int>
    <int name="HGPY00001E44ED6C4FB3B">1</int>
    <int name="HGPY00001FA631034A1B8">1</int>
    <int name="HGPY00003317ABAC43B48">1</int>
    <int name="HGPY00003A17B2294CB5A">5</int>
    <int name="HGPY00003ADD2B3D48C39">1</int>
  </lst>
</lst>
{code}

Becomes:

{code}
<lst name="facet_fields">
  <lst name="hgid">
    <int name="namedistinct">7</int>  <!-- this is not 11 -->
    <lst name="hgid">
      <int name="HGPY0000045FD36D4000A">1</int>
      <int name="HGPY00000FBC6690453A9">1</int>
      <int name="HGPY00001E44ED6C4FB3B">1</int>
      <int name="HGPY00001FA631034A1B8">1</int>
      <int name="HGPY00003317ABAC43B48">1</int>
      <int name="HGPY00003A17B2294CB5A">5</int>
      <int name="HGPY00003ADD2B3D48C39">1</int>
    </lst>
  </lst>
</lst>
{code}

  
> Get distinct count of names for a facet field
> ---------------------------------------------
>
>                 Key: SOLR-2242
>                 URL: https://issues.apache.org/jira/browse/SOLR-2242
>             Project: Solr
>          Issue Type: New Feature
>          Components: Response Writers
>    Affects Versions: 4.0
>            Reporter: Bill Bell
>            Priority: Minor
>             Fix For: 4.0
>
>         Attachments: SOLR-2242-distinctFacet.patch, SOLR-2242.v2.patch
>
>
> When returning facet.field=<name of field> you will get a list of matches for distinct values. This is normal behavior. This patch tells you how many distinct values you have (# of rows). Use with limit=-1 and mincount=1.
> The feature is called "namedistinct". Here is an example:
> http://localhost:8983/solr/select?q=*:*&facet=true&facet.field=manu&facet.mincount=1&facet.limit=-1&f.manu.facet.namedistinct=0&facet.field=price&f.price.facet.namedistinct=1
> Here is an example on field "hgid" (without namedistinct):
> {code}
> - <lst name="facet_fields">
> - <lst name="hgid">
>   <int name="HGPY0000045FD36D4000A">1</int> 
>   <int name="HGPY00000FBC6690453A9">1</int> 
>   <int name="HGPY00001E44ED6C4FB3B">1</int> 
>   <int name="HGPY00001FA631034A1B8">1</int> 
>   <int name="HGPY00003317ABAC43B48">1</int> 
>   <int name="HGPY00003A17B2294CB5A">5</int> 
>   <int name="HGPY00003ADD2B3D48C39">1</int> 
>   </lst>
>   </lst>
> {code}
> With namedistinct (HGPY0000045FD36D4000A, HGPY00000FBC6690453A9, HGPY00001E44ED6C4FB3B, HGPY00001FA631034A1B8, HGPY00003317ABAC43B48, HGPY00003A17B2294CB5A, HGPY00003ADD2B3D48C39). This returns number of rows (7), not the number of values (11).
> {code}
> - <lst name="facet_fields">
> - <lst name="hgid">
>   <int name="_count_">7</int> 
>   </lst>
>   </lst>
> {code}
> This works actually really good to get total number of fields for a group.field=hgid. Enjoy!



[jira] [Commented] (SOLR-2242) Get distinct count of names for a facet field

Posted by "Cody Young (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13183627#comment-13183627 ] 

Cody Young commented on SOLR-2242:
----------------------------------

Had another idea that maintains backwards compatibility. We could add a new facet section:

{code:xml} 
<lst name="facet_fields">
  <lst name="text">
    <int name="electronics">14</int>
    <int name="inc">8</int>
    <int name="2.0">5</int>
    <int name="lcd">5</int>
    <int name="memory">5</int>
  </lst>
</lst>
<lst name="facet_numTerms">
   <int name="text">124</int>
</lst>
{code}

facet.query, facet.date and facet.range each show up in their own section, so why not facet.numTerms as well?

That actually raises an interesting question: we'll want to control this on a per-field basis. What about something like facet.numTerms=FieldName? That would bring it more in line with facet.date and facet.range.
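
As a rough illustration of why a separate section stays backwards compatible, the Python sketch below reads the response shape proposed above. The outer `facet_counts` wrapper is an assumption added so the snippet parses as a single document, and `read_sections` is a made-up helper, not anything from the patch.

```python
import xml.etree.ElementTree as ET

# Proposed response shape; the <lst name="facet_counts"> wrapper is an
# assumption added here so both sections live under one root element.
RESPONSE = """
<lst name="facet_counts">
  <lst name="facet_fields">
    <lst name="text">
      <int name="electronics">14</int>
      <int name="inc">8</int>
      <int name="2.0">5</int>
      <int name="lcd">5</int>
      <int name="memory">5</int>
    </lst>
  </lst>
  <lst name="facet_numTerms">
    <int name="text">124</int>
  </lst>
</lst>
"""


def read_sections(xml_text):
    """Return (facet_fields, facet_numTerms) as plain dicts (hypothetical helper)."""
    root = ET.fromstring(xml_text.strip())
    sections = {child.get("name"): child for child in root.findall("lst")}
    # Existing clients only look here; the term list contains no extra entry.
    fields = {
        lst.get("name"): {i.get("name"): int(i.text) for i in lst.findall("int")}
        for lst in sections["facet_fields"].findall("lst")
    }
    # New clients can additionally read the per-field distinct term count.
    num_terms = {
        i.get("name"): int(i.text)
        for i in sections["facet_numTerms"].findall("int")
    }
    return fields, num_terms


fields, num_terms = read_sections(RESPONSE)
print(list(fields["text"]), num_terms)
```

Because the count lives outside `facet_fields`, code that iterates the terms of a field never sees a synthetic entry mixed in with real terms.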

Cody
                
> Get distinct count of names for a facet field
> ---------------------------------------------
>
>                 Key: SOLR-2242
>                 URL: https://issues.apache.org/jira/browse/SOLR-2242
>             Project: Solr
>          Issue Type: New Feature
>          Components: Response Writers
>    Affects Versions: 4.0
>            Reporter: Bill Bell
>            Assignee: Erick Erickson
>            Priority: Minor
>             Fix For: 4.0
>
>         Attachments: NumFacetTermsFacetsTest.java, SOLR-2242-notworkingtest.patch, SOLR-2242.patch, SOLR-2242.patch, SOLR-2242.patch, SOLR-2242.shard.patch, SOLR-2242.shard.patch, SOLR-2242.shard.withtests.patch, SOLR-2242.solr3.1-fix.patch, SOLR-2242.solr3.1.patch, SOLR.2242.solr3.1.patch, SOLR.2242.v2.patch
>
>
> When returning facet.field=<name of field> you will get a list of matches for distinct values. This is normal behavior. This patch tells you how many distinct values you have (# of rows). Use with limit=-1 and mincount=1.
> The feature is called "namedistinct". Here is an example:
> http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=2&facet.limit=-1&facet.field=price
> http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=0&facet.limit=-1&facet.field=price
> http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=1&facet.limit=-1&facet.field=price
> This currently only works on facet.field.
> {code}
> <lst name="facet_fields">
>   <lst name="price">
>     <int name="numFacetTerms">14</int>
>     <int name="0.0">3</int><int name="11.5">1</int><int name="19.95">1</int><int name="74.99">1</int><int name="92.0">1</int><int name="179.99">1</int><int name="185.0">1</int><int name="279.95">1</int><int name="329.95">1</int><int name="350.0">1</int><int name="399.0">1</int><int name="479.95">1</int><int name="649.99">1</int><int name="2199.0">1</int>
>   </lst>
> </lst>
> {code} 
> Several people use this to get the group.field count (the # of groups).



[jira] [Commented] (SOLR-2242) Get distinct count of names for a facet field

Posted by "Amber Duque (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13507488#comment-13507488 ] 

Amber Duque commented on SOLR-2242:
-----------------------------------

I have a question about SOLR-2242-solr40-3.patch.
I applied this patch on top of the Solr 4.0 release (tag lucene_solr_4_0_0 under http://svn.apache.org/repos/asf/lucene/dev/tags/).
The patch builds fine, but several Solr unit tests fail:

Tests with failures:
  - org.apache.solr.request.TestFaceting.testFacets
  - org.apache.solr.request.TestFaceting.testRegularBig
  - org.apache.solr.cloud.BasicDistributedZkTest.testDistribSearch
  - org.apache.solr.TestDistributedSearch.testDistribSearch
  - org.apache.solr.TestDistributedGrouping.testDistribSearch
  - org.apache.solr.request.SimpleFacetsTest (suite)
  - org.apache.solr.TestGroupingSearch.testRandomGrouping
  - org.apache.solr.TestGroupingSearch.testGroupingGroupedBasedFaceting
  - org.apache.solr.cloud.BasicDistributedZk2Test.testDistribSearch

Do the unit tests pass for anyone with this patch applied on top of the Solr 4.0 release?

Thanks!
                
> Get distinct count of names for a facet field
> ---------------------------------------------
>
>                 Key: SOLR-2242
>                 URL: https://issues.apache.org/jira/browse/SOLR-2242
>             Project: Solr
>          Issue Type: New Feature
>          Components: Response Writers
>    Affects Versions: 4.0-ALPHA
>            Reporter: Bill Bell
>            Priority: Minor
>             Fix For: 4.1
>
>         Attachments: SOLR-2242-3x_5_tests.patch, SOLR-2242-3x.patch, SOLR-2242.patch, SOLR-2242.patch, SOLR-2242.patch, SOLR-2242.shard.withtests.patch, SOLR-2242.solr3.1-fix.patch, SOLR-2242.solr3.1.patch, SOLR.2242.solr3.1.patch, SOLR-2242-solr40-3.patch
>
>
> When returning facet.field=<name of field> you will get a list of matches for distinct values. This is normal behavior. This patch tells you how many distinct values you have (# of rows). Use with limit=-1 and mincount=1.
> The feature is called "namedistinct". Here is an example:
> Parameters:
> facet.numTerms or f.<field>.facet.numTerms = true (default is false) - turn on distinct counting of terms
> facet.field - the field to count the terms
> It creates a new section in the facet section...
> http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numTerms=true&facet.limit=-1&facet.field=price
> http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numTerms=false&facet.limit=-1&facet.field=price
> http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numTerms=true&facet.limit=-1&facet.field=price
> This currently only works on facet.field.
> {code}
> <lst name="facet_counts">
> <lst name="facet_queries"/>
> <lst name="facet_fields">...</lst>
> <lst name="facet_numTerms">
> <lst name="localhost:8983/solr/">
> <int name="price">14</int>
> </lst>
> <lst name="localhost:8080/solr/">
> <int name="price">14</int>
> </lst>
> </lst>
> <lst name="facet_dates"/>
> <lst name="facet_ranges"/>
> </lst>
> OR with no sharding-
> <lst name="facet_numTerms">
> <int name="price">14</int>
> </lst>
> {code} 
> Several people use this to get the group.field count (the # of groups).
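
For readers following the quoted description, here is a hedged Python sketch that reads both `facet_numTerms` shapes shown in it (per-shard and non-sharded). The helper `read_num_terms` and the use of `None` as the "no sharding" key are illustrative choices, not something the patch defines.

```python
import xml.etree.ElementTree as ET

# The two shapes from the quoted description.
SHARDED = """
<lst name="facet_numTerms">
  <lst name="localhost:8983/solr/">
    <int name="price">14</int>
  </lst>
  <lst name="localhost:8080/solr/">
    <int name="price">14</int>
  </lst>
</lst>
"""

UNSHARDED = """
<lst name="facet_numTerms">
  <int name="price">14</int>
</lst>
"""


def read_num_terms(xml_text):
    """Map shard name -> {field: distinct term count}.

    The key None stands for the non-sharded form (an illustrative convention).
    """
    root = ET.fromstring(xml_text.strip())
    result = {}
    for child in root:
        if child.tag == "int":
            # Non-sharded: counts sit directly under facet_numTerms.
            result.setdefault(None, {})[child.get("name")] = int(child.text)
        elif child.tag == "lst":
            # Sharded: one nested list per shard.
            result[child.get("name")] = {
                node.get("name"): int(node.text) for node in child.findall("int")
            }
    return result


print(read_num_terms(UNSHARDED))
print(read_num_terms(SHARDED))
```

Note that per-shard distinct counts generally cannot be added together to get a global distinct count, because the same term can occur on more than one shard.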



[jira] [Updated] (SOLR-2242) Get distinct count of names for a facet field

Posted by "Bill Bell (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bill Bell updated SOLR-2242:
----------------------------

    Description: 
When returning facet.field=<name of field> you will get a list of matches for distinct values. This is normal behavior. This patch tells you how many distinct values you have (# of rows). Use with limit=-1 and mincount=1.


The feature is called "namedistinct". Here is an example:

http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=2&facet.limit=-1&facet.field=price

http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=0&facet.limit=-1&facet.field=price

http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=1&facet.limit=-1&facet.field=price

This currently only works on facet.field.

{code}

<lst name="facet_fields">
  <lst name="price">
    <int name="numFacetTerms">14</int>
    <int name="0.0">3</int><int name="11.5">1</int><int name="19.95">1</int><int name="74.99">1</int><int name="92.0">1</int><int name="179.99">1</int><int name="185.0">1</int><int name="279.95">1</int><int name="329.95">1</int><int name="350.0">1</int><int name="399.0">1</int><int name="479.95">1</int><int name="649.99">1</int><int name="2199.0">1</int>
  </lst>
</lst>

{code} 

Several people use this to get the group.field count (the # of groups).
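
The example requests above differ only in the value of facet.numFacetTerms. As a small sketch, assuming nothing beyond the parameters already shown, the following Python builds the same URLs; `facet_url` is a hypothetical helper, and the `safe` characters are chosen only so the output matches the URLs as written in this description.

```python
from urllib.parse import urlencode

BASE = "http://localhost:8983/solr/select"


def facet_url(field, num_facet_terms, shards=None):
    """Build a facet request like the examples above (hypothetical helper)."""
    params = []
    if shards:
        params.append(("shards", ",".join(shards)))
    params += [
        ("indent", "true"),
        ("q", "*:*"),
        ("facet", "true"),
        ("facet.mincount", "1"),
        # Value is passed through as-is; see the patch for what each value means.
        ("facet.numFacetTerms", str(num_facet_terms)),
        ("facet.limit", "-1"),
        ("facet.field", field),
    ]
    # Leave *, :, / and , unescaped so the result reads like the URLs above.
    return BASE + "?" + urlencode(params, safe="*:/,")


print(facet_url("price", 2, ["localhost:8983/solr", "localhost:7574/solr"]))
```

Keeping facet.limit=-1 and facet.mincount=1 in the helper mirrors the recommendation at the top of this description.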



  was:
When returning facet.field=<name of field> you will get a list of matches for distinct values. This is normal behavior. This patch tells you how many distinct values you have (# of rows). Use with limit=-1 and mincount=1.



The feature is called "namedistinct". Here is an example:

http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=2&facet.limit=-1&facet.field=price

http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=0&facet.limit=-1&facet.field=price

http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=1&facet.limit=-1&facet.field=price

Here is an example on field "hgid" (without namedistinct):

{code}
<lst name="facet_fields">
  <lst name="hgid">
    <int name="HGPY0000045FD36D4000A">1</int>
    <int name="HGPY00000FBC6690453A9">1</int>
    <int name="HGPY00001E44ED6C4FB3B">1</int>
    <int name="HGPY00001FA631034A1B8">1</int>
    <int name="HGPY00003317ABAC43B48">1</int>
    <int name="HGPY00003A17B2294CB5A">5</int>
    <int name="HGPY00003ADD2B3D48C39">1</int>
  </lst>
</lst>
{code}

With namedistinct (HGPY0000045FD36D4000A, HGPY00000FBC6690453A9, HGPY00001E44ED6C4FB3B, HGPY00001FA631034A1B8, HGPY00003317ABAC43B48, HGPY00003A17B2294CB5A, HGPY00003ADD2B3D48C39). This returns number of rows (7), not the number of values (11).

{code}
<lst name="facet_fields">
  <lst name="hgid">
    <int name="_count_">7</int>
  </lst>
</lst>
{code}
This actually works really well for getting the total number of groups for group.field=hgid. Enjoy!


> Get distinct count of names for a facet field
> ---------------------------------------------
>
>                 Key: SOLR-2242
>                 URL: https://issues.apache.org/jira/browse/SOLR-2242
>             Project: Solr
>          Issue Type: New Feature
>          Components: Response Writers
>    Affects Versions: 4.0
>            Reporter: Bill Bell
>            Priority: Minor
>             Fix For: 4.0
>
>         Attachments: SOLR-2242.patch, SOLR-2242.shard.patch, SOLR-2242.solr3.1.patch, SOLR.2242.solr3.1.patch, SOLR.2242.v2.patch
>
>
> When returning facet.field=<name of field> you will get a list of matches for distinct values. This is normal behavior. This patch tells you how many distinct values you have (# of rows). Use with limit=-1 and mincount=1.
> The feature is called "namedistinct". Here is an example:
> http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=2&facet.limit=-1&facet.field=price
> http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=0&facet.limit=-1&facet.field=price
> http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=1&facet.limit=-1&facet.field=price
> This currently only works on facet.field.
> {code}
> <lst name="facet_fields">
>   <lst name="price">
>     <int name="numFacetTerms">14</int>
>     <int name="0.0">3</int><int name="11.5">1</int><int name="19.95">1</int><int name="74.99">1</int><int name="92.0">1</int><int name="179.99">1</int><int name="185.0">1</int><int name="279.95">1</int><int name="329.95">1</int><int name="350.0">1</int><int name="399.0">1</int><int name="479.95">1</int><int name="649.99">1</int><int name="2199.0">1</int>
>   </lst>
> </lst>
> {code} 
> Several people use this to get the group.field count (the # of groups).



[jira] [Updated] (SOLR-2242) Get distinct count of names for a facet field

Posted by "Bill Bell (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bill Bell updated SOLR-2242:
----------------------------

    Attachment:     (was: SOLR-2242-3x_3.patch)
    
> Get distinct count of names for a facet field
> ---------------------------------------------
>
>                 Key: SOLR-2242
>                 URL: https://issues.apache.org/jira/browse/SOLR-2242
>             Project: Solr
>          Issue Type: New Feature
>          Components: Response Writers
>    Affects Versions: 4.0
>            Reporter: Bill Bell
>            Priority: Minor
>             Fix For: 4.0
>
>         Attachments: SOLR-2242-3x.patch, SOLR-2242-3x_4.patch, SOLR-2242-solr40-3.patch, SOLR-2242.patch, SOLR-2242.patch, SOLR-2242.patch, SOLR-2242.shard.withtests.patch, SOLR-2242.solr3.1-fix.patch, SOLR-2242.solr3.1.patch, SOLR.2242.solr3.1.patch
>
>
> When returning facet.field=<name of field> you will get a list of matches for distinct values. This is normal behavior. This patch tells you how many distinct values you have (# of rows). Use with limit=-1 and mincount=1.
> The feature is called "namedistinct". Here is an example:
> http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=2&facet.limit=-1&facet.field=price
> http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=0&facet.limit=-1&facet.field=price
> http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=1&facet.limit=-1&facet.field=price
> This currently only works on facet.field.
> {code}
> <lst name="facet_fields">
>   <lst name="price">
>     <int name="numFacetTerms">14</int>
>     <int name="0.0">3</int><int name="11.5">1</int><int name="19.95">1</int><int name="74.99">1</int><int name="92.0">1</int><int name="179.99">1</int><int name="185.0">1</int><int name="279.95">1</int><int name="329.95">1</int><int name="350.0">1</int><int name="399.0">1</int><int name="479.95">1</int><int name="649.99">1</int><int name="2199.0">1</int>
>   </lst>
> </lst>
> {code} 
> Several people use this to get the group.field count (the # of groups).



[jira] [Commented] (SOLR-2242) Get distinct count of names for a facet field

Posted by "Lance Norskog (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13024966#comment-13024966 ] 

Lance Norskog commented on SOLR-2242:
-------------------------------------

From the patch:
bq. {{public static final String FACET_NAMEDISTINCT = FACET + ".numFacetTerms";}}
So, in this issue, a _name_ is what everything else calls a _term_. Please change this in the patch.






> Get distinct count of names for a facet field
> ---------------------------------------------
>
>                 Key: SOLR-2242
>                 URL: https://issues.apache.org/jira/browse/SOLR-2242
>             Project: Solr
>          Issue Type: New Feature
>          Components: Response Writers
>    Affects Versions: 4.0
>            Reporter: Bill Bell
>            Priority: Minor
>             Fix For: 4.0
>
>         Attachments: SOLR.2242.solr3.1.patch, SOLR.2242.v2.patch
>
>
> When returning facet.field=<name of field> you will get a list of matches for distinct values. This is normal behavior. This patch tells you how many distinct values you have (# of rows). Use with limit=-1 and mincount=1.
> The feature is called "namedistinct". Here is an example:
> http://localhost:8983/solr/select?q=*:*&facet=true&facet.field=manu&facet.mincount=1&facet.limit=-1&f.manu.facet.namedistinct=0&facet.field=price&f.price.facet.namedistinct=1
> Here is an example on field "hgid" (without namedistinct):
> {code}
> - <lst name="facet_fields">
> - <lst name="hgid">
>   <int name="HGPY0000045FD36D4000A">1</int> 
>   <int name="HGPY00000FBC6690453A9">1</int> 
>   <int name="HGPY00001E44ED6C4FB3B">1</int> 
>   <int name="HGPY00001FA631034A1B8">1</int> 
>   <int name="HGPY00003317ABAC43B48">1</int> 
>   <int name="HGPY00003A17B2294CB5A">5</int> 
>   <int name="HGPY00003ADD2B3D48C39">1</int> 
>   </lst>
>   </lst>
> {code}
> With namedistinct (HGPY0000045FD36D4000A, HGPY00000FBC6690453A9, HGPY00001E44ED6C4FB3B, HGPY00001FA631034A1B8, HGPY00003317ABAC43B48, HGPY00003A17B2294CB5A, HGPY00003ADD2B3D48C39). This returns number of rows (7), not the number of values (11).
> {code}
> - <lst name="facet_fields">
> - <lst name="hgid">
>   <int name="_count_">7</int> 
>   </lst>
>   </lst>
> {code}
> This works actually really good to get total number of fields for a group.field=hgid. Enjoy!



[jira] [Updated] (SOLR-2242) Get distinct count of names for a facet field

Posted by "Simon Willnauer (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Simon Willnauer updated SOLR-2242:
----------------------------------

    Comment: was deleted

(was: 
I am out of the office on vacation, I will return Monday July 11. I will not be checking email.

For urgent Systems Department business, please contact Mercy Anaba, manaba@jhu.edu,        (410) 516-5306.
)

> Get distinct count of names for a facet field
> ---------------------------------------------
>
>                 Key: SOLR-2242
>                 URL: https://issues.apache.org/jira/browse/SOLR-2242
>             Project: Solr
>          Issue Type: New Feature
>          Components: Response Writers
>    Affects Versions: 4.0
>            Reporter: Bill Bell
>            Assignee: Simon Willnauer
>            Priority: Minor
>             Fix For: 4.0
>
>         Attachments: SOLR-2242-notworkingtest.patch, SOLR-2242.patch, SOLR-2242.patch, SOLR-2242.shard.patch, SOLR-2242.shard.patch, SOLR-2242.shard.withtests.patch, SOLR-2242.solr3.1.patch, SOLR.2242.solr3.1.patch, SOLR.2242.v2.patch
>
>
> When returning facet.field=<name of field> you will get a list of matches for distinct values. This is normal behavior. This patch tells you how many distinct values you have (# of rows). Use with limit=-1 and mincount=1.
> The feature is called "namedistinct". Here is an example:
> http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=2&facet.limit=-1&facet.field=price
> http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=0&facet.limit=-1&facet.field=price
> http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=1&facet.limit=-1&facet.field=price
> This currently only works on facet.field.
> {code}
> <lst name="facet_fields">
>   <lst name="price">
>     <int name="numFacetTerms">14</int>
>     <int name="0.0">3</int><int name="11.5">1</int><int name="19.95">1</int><int name="74.99">1</int><int name="92.0">1</int><int name="179.99">1</int><int name="185.0">1</int><int name="279.95">1</int><int name="329.95">1</int><int name="350.0">1</int><int name="399.0">1</int><int name="479.95">1</int><int name="649.99">1</int><int name="2199.0">1</int>
>   </lst>
> </lst>
> {code} 
> Several people use this to get the group.field count (the # of groups).



[jira] Commented: (SOLR-2242) Get distinct count of names for a facet field

Posted by "Jonathan Rochkind (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13006776#comment-13006776 ] 

Jonathan Rochkind commented on SOLR-2242:
-----------------------------------------

If the naming is the sticking point: the value here is the total count of facet values, the number of facet values you'd get if you did facet.limit=-1, but without the need to assemble every facet value in memory and send it across the wire. This is quite analogous to numFound in the main response, the total number of documents matching your query that you'd get if you set rows=-1, but without needing to actually assemble them all and send them across the wire. Is there some way to use this parallelism in the name of the total count of facet values? numFacetsFound?


> Get distinct count of names for a facet field
> ---------------------------------------------
>
>                 Key: SOLR-2242
>                 URL: https://issues.apache.org/jira/browse/SOLR-2242
>             Project: Solr
>          Issue Type: New Feature
>          Components: Response Writers
>    Affects Versions: 4.0
>            Reporter: Bill Bell
>            Priority: Minor
>             Fix For: 4.0
>
>         Attachments: SOLR-2242-distinctFacet.patch
>
>
> When returning facet.field=<name of field> you will get a list of matches for distinct values. This is normal behavior. This patch tells you how many distinct values you have (# of rows). Use with limit=-1 and mincount=1.
> The feature is called "namedistinct". Here is an example:
> http://localhost:8983/solr/select?q=*:*&facet=true&facet.field=manu&facet.mincount=1&facet.limit=-1&f.manu.facet.namedistinct=0&facet.field=price&f.price.facet.namedistinct=1
> Here is an example on field "hgid" (without namedistinct):
> {code}
> - <lst name="facet_fields">
> - <lst name="hgid">
>   <int name="HGPY0000045FD36D4000A">1</int> 
>   <int name="HGPY00000FBC6690453A9">1</int> 
>   <int name="HGPY00001E44ED6C4FB3B">1</int> 
>   <int name="HGPY00001FA631034A1B8">1</int> 
>   <int name="HGPY00003317ABAC43B48">1</int> 
>   <int name="HGPY00003A17B2294CB5A">5</int> 
>   <int name="HGPY00003ADD2B3D48C39">1</int> 
>   </lst>
>   </lst>
> {code}
> With namedistinct (HGPY0000045FD36D4000A, HGPY00000FBC6690453A9, HGPY00001E44ED6C4FB3B, HGPY00001FA631034A1B8, HGPY00003317ABAC43B48, HGPY00003A17B2294CB5A, HGPY00003ADD2B3D48C39). This returns number of rows (7), not the number of values (11).
> {code}
> - <lst name="facet_fields">
> - <lst name="hgid">
>   <int name="_count_">7</int> 
>   </lst>
>   </lst>
> {code}
> This works actually really good to get total number of fields for a group.field=hgid. Enjoy!



[jira] [Commented] (SOLR-2242) Get distinct count of names for a facet field

Posted by "Bill Bell (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13240240#comment-13240240 ] 

Bill Bell commented on SOLR-2242:
---------------------------------

Found a bug and attaching new patch.
                
> Get distinct count of names for a facet field
> ---------------------------------------------
>
>                 Key: SOLR-2242
>                 URL: https://issues.apache.org/jira/browse/SOLR-2242
>             Project: Solr
>          Issue Type: New Feature
>          Components: Response Writers
>    Affects Versions: 4.0
>            Reporter: Bill Bell
>            Priority: Minor
>             Fix For: 4.0
>
>         Attachments: SOLR-2242-3x.patch, SOLR-2242-3x_2.patch, SOLR-2242-solr40-3.patch, SOLR-2242.patch, SOLR-2242.patch, SOLR-2242.patch, SOLR-2242.shard.withtests.patch, SOLR-2242.solr3.1-fix.patch, SOLR-2242.solr3.1.patch, SOLR-2242.solr35.patch, SOLR.2242.solr3.1.patch
>
>
> When returning facet.field=<name of field> you will get a list of matches for distinct values. This is normal behavior. This patch tells you how many distinct values you have (# of rows). Use with limit=-1 and mincount=1.
> The feature is called "namedistinct". Here is an example:
> http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=2&facet.limit=-1&facet.field=price
> http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=0&facet.limit=-1&facet.field=price
> http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=1&facet.limit=-1&facet.field=price
> This currently only works on facet.field.
> {code}
> <lst name="facet_fields">
>   <lst name="price">
>     <int name="numFacetTerms">14</int>
>     <int name="0.0">3</int><int name="11.5">1</int><int name="19.95">1</int><int name="74.99">1</int><int name="92.0">1</int><int name="179.99">1</int><int name="185.0">1</int><int name="279.95">1</int><int name="329.95">1</int><int name="350.0">1</int><int name="399.0">1</int><int name="479.95">1</int><int name="649.99">1</int><int name="2199.0">1</int>
>   </lst>
> </lst>
> {code} 
> Several people use this to get the group.field count (the # of groups).



[jira] [Commented] (SOLR-2242) Get distinct count of names for a facet field

Posted by "Bill Bell (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13054932#comment-13054932 ] 

Bill Bell commented on SOLR-2242:
---------------------------------

Just so you know, I have been using the original patch in production for over 5 months, so I would say that the original one is tested.

But now that we are changing it, I agree that we need more coverage.

That will be my #1 priority.

> Get distinct count of names for a facet field
> ---------------------------------------------
>
>                 Key: SOLR-2242
>                 URL: https://issues.apache.org/jira/browse/SOLR-2242
>             Project: Solr
>          Issue Type: New Feature
>          Components: Response Writers
>    Affects Versions: 4.0
>            Reporter: Bill Bell
>            Assignee: Simon Willnauer
>            Priority: Minor
>             Fix For: 4.0
>
>         Attachments: SOLR-2242.patch, SOLR-2242.shard.patch, SOLR-2242.shard.patch, SOLR-2242.solr3.1.patch, SOLR.2242.solr3.1.patch, SOLR.2242.v2.patch
>
>
> When returning facet.field=<name of field> you will get a list of matches for distinct values. This is normal behavior. This patch tells you how many distinct values you have (# of rows). Use with limit=-1 and mincount=1.
> The feature is called "namedistinct". Here is an example:
> http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=2&facet.limit=-1&facet.field=price
> http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=0&facet.limit=-1&facet.field=price
> http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=1&facet.limit=-1&facet.field=price
> This currently only works on facet.field.
> {code}
> <lst name="facet_fields">
>   <lst name="price">
>     <int name="numFacetTerms">14</int>
>     <int name="0.0">3</int><int name="11.5">1</int><int name="19.95">1</int><int name="74.99">1</int><int name="92.0">1</int><int name="179.99">1</int><int name="185.0">1</int><int name="279.95">1</int><int name="329.95">1</int><int name="350.0">1</int><int name="399.0">1</int><int name="479.95">1</int><int name="649.99">1</int><int name="2199.0">1</int>
>   </lst>
> </lst>
> {code} 
> Several people use this to get the group.field count (the # of groups).



[jira] [Updated] (SOLR-2242) Get distinct count of names for a facet field

Posted by "Bill Bell (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bill Bell updated SOLR-2242:
----------------------------

    Description: 
When returning facet.field=<name of field> you will get a list of matches for distinct values. This is normal behavior. This patch tells you how many distinct values you have (# of rows). Use with limit=-1 and mincount=1.



The feature is called "namedistinct". Here is an example:

http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=2&facet.limit=-1&facet.field=price

http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=0&facet.limit=-1&facet.field=price

http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=1&facet.limit=-1&facet.field=price

Here is an example on field "hgid" (without namedistinct):

{code}
- <lst name="facet_fields">
- <lst name="hgid">
  <int name="HGPY0000045FD36D4000A">1</int> 
  <int name="HGPY00000FBC6690453A9">1</int> 
  <int name="HGPY00001E44ED6C4FB3B">1</int> 
  <int name="HGPY00001FA631034A1B8">1</int> 
  <int name="HGPY00003317ABAC43B48">1</int> 
  <int name="HGPY00003A17B2294CB5A">5</int> 
  <int name="HGPY00003ADD2B3D48C39">1</int> 
  </lst>
  </lst>
{code}

With namedistinct (HGPY0000045FD36D4000A, HGPY00000FBC6690453A9, HGPY00001E44ED6C4FB3B, HGPY00001FA631034A1B8, HGPY00003317ABAC43B48, HGPY00003A17B2294CB5A, HGPY00003ADD2B3D48C39). This returns number of rows (7), not the number of values (11).

{code}
- <lst name="facet_fields">
- <lst name="hgid">
  <int name="_count_">7</int> 
  </lst>
  </lst>
{code}
This works actually really good to get total number of fields for a group.field=hgid. Enjoy!

  was:
When returning facet.field=<name of field> you will get a list of matches for distinct values. This is normal behavior. This patch tells you how many distinct values you have (# of rows). Use with limit=-1 and mincount=1.



The feature is called "namedistinct". Here is an example:

http://localhost:8983/solr/select?q=*:*&facet=true&facet.field=manu&facet.mincount=1&facet.limit=-1&f.manu.facet.namedistinct=0&facet.field=price&f.price.facet.namedistinct=1

Here is an example on field "hgid" (without namedistinct):

{code}
- <lst name="facet_fields">
- <lst name="hgid">
  <int name="HGPY0000045FD36D4000A">1</int> 
  <int name="HGPY00000FBC6690453A9">1</int> 
  <int name="HGPY00001E44ED6C4FB3B">1</int> 
  <int name="HGPY00001FA631034A1B8">1</int> 
  <int name="HGPY00003317ABAC43B48">1</int> 
  <int name="HGPY00003A17B2294CB5A">5</int> 
  <int name="HGPY00003ADD2B3D48C39">1</int> 
  </lst>
  </lst>
{code}

With namedistinct (HGPY0000045FD36D4000A, HGPY00000FBC6690453A9, HGPY00001E44ED6C4FB3B, HGPY00001FA631034A1B8, HGPY00003317ABAC43B48, HGPY00003A17B2294CB5A, HGPY00003ADD2B3D48C39). This returns number of rows (7), not the number of values (11).

{code}
- <lst name="facet_fields">
- <lst name="hgid">
  <int name="_count_">7</int> 
  </lst>
  </lst>
{code}
This works actually really good to get total number of fields for a group.field=hgid. Enjoy!


> Get distinct count of names for a facet field
> ---------------------------------------------
>
>                 Key: SOLR-2242
>                 URL: https://issues.apache.org/jira/browse/SOLR-2242
>             Project: Solr
>          Issue Type: New Feature
>          Components: Response Writers
>    Affects Versions: 4.0
>            Reporter: Bill Bell
>            Priority: Minor
>             Fix For: 4.0
>
>         Attachments: SOLR-2242.patch, SOLR-2242.shard.patch, SOLR-2242.solr3.1.patch, SOLR.2242.solr3.1.patch, SOLR.2242.v2.patch
>
>
> When returning facet.field=<name of field> you will get a list of matches for distinct values. This is normal behavior. This patch tells you how many distinct values you have (# of rows). Use with limit=-1 and mincount=1.
> The feature is called "namedistinct". Here is an example:
> http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=2&facet.limit=-1&facet.field=price
> http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=0&facet.limit=-1&facet.field=price
> http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=1&facet.limit=-1&facet.field=price
> Here is an example on field "hgid" (without namedistinct):
> {code}
> - <lst name="facet_fields">
> - <lst name="hgid">
>   <int name="HGPY0000045FD36D4000A">1</int> 
>   <int name="HGPY00000FBC6690453A9">1</int> 
>   <int name="HGPY00001E44ED6C4FB3B">1</int> 
>   <int name="HGPY00001FA631034A1B8">1</int> 
>   <int name="HGPY00003317ABAC43B48">1</int> 
>   <int name="HGPY00003A17B2294CB5A">5</int> 
>   <int name="HGPY00003ADD2B3D48C39">1</int> 
>   </lst>
>   </lst>
> {code}
> With namedistinct (HGPY0000045FD36D4000A, HGPY00000FBC6690453A9, HGPY00001E44ED6C4FB3B, HGPY00001FA631034A1B8, HGPY00003317ABAC43B48, HGPY00003A17B2294CB5A, HGPY00003ADD2B3D48C39). This returns number of rows (7), not the number of values (11).
> {code}
> - <lst name="facet_fields">
> - <lst name="hgid">
>   <int name="_count_">7</int> 
>   </lst>
>   </lst>
> {code}
> This works actually really good to get total number of fields for a group.field=hgid. Enjoy!



[jira] Updated: (SOLR-2242) Get distinct count of names for a facet field

Posted by "Bill Bell (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bill Bell updated SOLR-2242:
----------------------------

    Attachment: SOLR.2242.v2.patch

New version.

> Get distinct count of names for a facet field
> ---------------------------------------------
>
>                 Key: SOLR-2242
>                 URL: https://issues.apache.org/jira/browse/SOLR-2242
>             Project: Solr
>          Issue Type: New Feature
>          Components: Response Writers
>    Affects Versions: 4.0
>            Reporter: Bill Bell
>            Priority: Minor
>             Fix For: 4.0
>
>         Attachments: SOLR-2242-distinctFacet.patch, SOLR.2242.v2.patch
>
>
> When returning facet.field=<name of field> you will get a list of matches for distinct values. This is normal behavior. This patch tells you how many distinct values you have (# of rows). Use with limit=-1 and mincount=1.
> The feature is called "namedistinct". Here is an example:
> http://localhost:8983/solr/select?q=*:*&facet=true&facet.field=manu&facet.mincount=1&facet.limit=-1&f.manu.facet.namedistinct=0&facet.field=price&f.price.facet.namedistinct=1
> Here is an example on field "hgid" (without namedistinct):
> {code}
> - <lst name="facet_fields">
> - <lst name="hgid">
>   <int name="HGPY0000045FD36D4000A">1</int> 
>   <int name="HGPY00000FBC6690453A9">1</int> 
>   <int name="HGPY00001E44ED6C4FB3B">1</int> 
>   <int name="HGPY00001FA631034A1B8">1</int> 
>   <int name="HGPY00003317ABAC43B48">1</int> 
>   <int name="HGPY00003A17B2294CB5A">5</int> 
>   <int name="HGPY00003ADD2B3D48C39">1</int> 
>   </lst>
>   </lst>
> {code}
> With namedistinct (HGPY0000045FD36D4000A, HGPY00000FBC6690453A9, HGPY00001E44ED6C4FB3B, HGPY00001FA631034A1B8, HGPY00003317ABAC43B48, HGPY00003A17B2294CB5A, HGPY00003ADD2B3D48C39). This returns number of rows (7), not the number of values (11).
> {code}
> - <lst name="facet_fields">
> - <lst name="hgid">
>   <int name="_count_">7</int> 
>   </lst>
>   </lst>
> {code}
> This works actually really good to get total number of fields for a group.field=hgid. Enjoy!



[jira] [Commented] (SOLR-2242) Get distinct count of names for a facet field

Posted by "Jonathan Rochkind (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13174179#comment-13174179 ] 

Jonathan Rochkind commented on SOLR-2242:
-----------------------------------------

I would find this feature valuable even if it simply did not work at all 
on a distributed index. (Refusing to return a value rather than 
returning a known incorrect value would seem like the right way to go).  
Because my index is not distributed, and I would find this feature 
valuable, heh.

I don't know if Solr currently has any policies against committing 
features that can't work on distributed, but personally my 'vote' would 
be doing that here, with clear documentation that it doesn't work on 
distributed (and the hope that future enhancements may make it more 
feasible to do so, as Erick suggests may possibly maybe happen).
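
To make the "known incorrect value" concern concrete, here is a minimal sketch of one way a distributed distinct count can go wrong. It is an assumption for illustration, not a description of the attached shard patches: if each shard reports only its local number of distinct terms, adding those numbers overcounts any term present on more than one shard, and an exact answer needs the union of the term sets. The class and method names (ShardDistinctCount, sumOfShardCounts, unionCount) and the sample shard contents are hypothetical.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class ShardDistinctCount {

    /** Naive merge: add up each shard's local distinct count. Overcounts shared terms. */
    public static int sumOfShardCounts(List<Set<String>> shardTerms) {
        int sum = 0;
        for (Set<String> terms : shardTerms) {
            sum += terms.size();
        }
        return sum;
    }

    /** Exact merge: union the term sets, which requires every shard to send its terms. */
    public static int unionCount(List<Set<String>> shardTerms) {
        Set<String> all = new HashSet<>();
        for (Set<String> terms : shardTerms) {
            all.addAll(terms);
        }
        return all.size();
    }

    public static void main(String[] args) {
        // Hypothetical price terms held by two shards; "0.0" occurs on both.
        Set<String> shard1 = new HashSet<>(Arrays.asList("0.0", "11.5", "19.95"));
        Set<String> shard2 = new HashSet<>(Arrays.asList("0.0", "74.99"));
        List<Set<String>> shards = Arrays.asList(shard1, shard2);

        System.out.println("sum of shard counts: " + sumOfShardCounts(shards)); // 5
        System.out.println("exact distinct count: " + unionCount(shards));      // 4
    }
}
```

The exact version gives the right answer only by shipping every term to the merging node, which is the cost the non-distributed count avoids.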


                
> Get distinct count of names for a facet field
> ---------------------------------------------
>
>                 Key: SOLR-2242
>                 URL: https://issues.apache.org/jira/browse/SOLR-2242
>             Project: Solr
>          Issue Type: New Feature
>          Components: Response Writers
>    Affects Versions: 4.0
>            Reporter: Bill Bell
>            Assignee: Erick Erickson
>            Priority: Minor
>             Fix For: 4.0
>
>         Attachments: NumFacetTermsFacetsTest.java, SOLR-2242-notworkingtest.patch, SOLR-2242.patch, SOLR-2242.patch, SOLR-2242.shard.patch, SOLR-2242.shard.patch, SOLR-2242.shard.withtests.patch, SOLR-2242.solr3.1-fix.patch, SOLR-2242.solr3.1.patch, SOLR.2242.solr3.1.patch, SOLR.2242.v2.patch
>
>
> When returning facet.field=<name of field> you will get a list of matches for distinct values. This is normal behavior. This patch tells you how many distinct values you have (# of rows). Use with limit=-1 and mincount=1.
> The feature is called "namedistinct". Here is an example:
> http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=2&facet.limit=-1&facet.field=price
> http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=0&facet.limit=-1&facet.field=price
> http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=1&facet.limit=-1&facet.field=price
> This currently only works on facet.field.
> {code}
> <lst name="facet_fields">
>   <lst name="price">
>     <int name="numFacetTerms">14</int>
>     <int name="0.0">3</int><int name="11.5">1</int><int name="19.95">1</int><int name="74.99">1</int><int name="92.0">1</int><int name="179.99">1</int><int name="185.0">1</int><int name="279.95">1</int><int name="329.95">1</int><int name="350.0">1</int><int name="399.0">1</int><int name="479.95">1</int><int name="649.99">1</int><int name="2199.0">1</int>
>   </lst>
> </lst>
> {code} 
> Several people use this to get the group.field count (the # of groups).



[jira] [Commented] (SOLR-2242) Get distinct count of names for a facet field

Posted by "Bill Bell (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13055155#comment-13055155 ] 

Bill Bell commented on SOLR-2242:
---------------------------------

I think it has to do with an NPE in grouping on 4.0; it fails in other code. It has nothing to do with this patch.

{code}

  assertQ("check group and facet counts with numFacetTerms=1",
            req("q", "id:[1 TO 6]"
                ,"indent", "on"
                ,"facet", "true"
                ,"group", "true"
                ,"group.field", "hgid_i1"
                ,"f.hgid_i1.facet.limit", "-1"
                ,"f.hgid_i1.facet.mincount", "1"
                ,"f.hgid_i1.facet.numFacetTerms", "1"
                ,"facet.field", "hgid_i1"
                )
            ,"*[count(//arr[@name='groups'])=1]"
            ,"*[count(//lst[@name='facet_fields']/lst[@name='hgid_i1']/int)=1]" // there are 1 unique items
            ,"//lst[@name='hgid_i1']/int[@name='numFacetTerms'][.='4']"
            );

{code}

> Get distinct count of names for a facet field
> ---------------------------------------------
>
>                 Key: SOLR-2242
>                 URL: https://issues.apache.org/jira/browse/SOLR-2242
>             Project: Solr
>          Issue Type: New Feature
>          Components: Response Writers
>    Affects Versions: 4.0
>            Reporter: Bill Bell
>            Assignee: Simon Willnauer
>            Priority: Minor
>             Fix For: 4.0
>
>         Attachments: SOLR-2242-notworkingtest.patch, SOLR-2242.patch, SOLR-2242.shard.patch, SOLR-2242.shard.patch, SOLR-2242.solr3.1.patch, SOLR.2242.solr3.1.patch, SOLR.2242.v2.patch
>
>
> When returning facet.field=<name of field> you will get a list of matches for distinct values. This is normal behavior. This patch tells you how many distinct values you have (# of rows). Use with limit=-1 and mincount=1.
> The feature is called "namedistinct". Here is an example:
> http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=2&facet.limit=-1&facet.field=price
> http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=0&facet.limit=-1&facet.field=price
> http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=1&facet.limit=-1&facet.field=price
> This currently only works on facet.field.
> {code}
> <lst name="facet_fields">
>   <lst name="price">
>     <int name="numFacetTerms">14</int>
>     <int name="0.0">3</int><int name="11.5">1</int><int name="19.95">1</int><int name="74.99">1</int><int name="92.0">1</int><int name="179.99">1</int><int name="185.0">1</int><int name="279.95">1</int><int name="329.95">1</int><int name="350.0">1</int><int name="399.0">1</int><int name="479.95">1</int><int name="649.99">1</int><int name="2199.0">1</int>
>   </lst>
> </lst>
> {code} 
> Several people use this to get the group.field count (the # of groups).



[jira] [Commented] (SOLR-2242) Get distinct count of names for a facet field

Posted by "Bill Bell (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13055151#comment-13055151 ] 

Bill Bell commented on SOLR-2242:
---------------------------------

OK. Here are some test cases.

I am getting a weird error on running it: ant -Dtestcase=NumFacetTermsFacetsTest test

{code}
junit-sequential:
    [junit] Testsuite: org.apache.solr.request.NumFacetTermsFacetsTest
    [junit] Tests run: 1, Failures: 1, Errors: 0, Time elapsed: 4.072 sec
    [junit] 
    [junit] ------------- Standard Error -----------------
    [junit] NOTE: reproduce with: ant test -Dtestcase=NumFacetTermsFacetsTest -Dtestmethod=testNumFacetTermsFacetCounts -Dtests.seed=3921835369594659663:-3219730304883530389
    [junit] *** BEGIN org.apache.solr.request.NumFacetTermsFacetsTest.testNumFacetTermsFacetCounts: Insane FieldCache usage(s) ***
    [junit] SUBREADER: Found caches for descendants of DirectoryReader(segments_3 _0(4.0):C6)+hgid_i1
    [junit] 	'DirectoryReader(segments_3 _0(4.0):C6)'=>'hgid_i1',class org.apache.lucene.search.FieldCache$DocTermsIndex,org.apache.lucene.search.cache.DocTermsIndexCreator@603bb3eb=>org.apache.lucene.search.cache.DocTermsIndexCreator$DocTermsIndexImpl#1026179434 (size =~ 372 bytes)
    [junit] 	'org.apache.lucene.index.SegmentCoreReaders@7e8905bd'=>'hgid_i1',int,org.apache.lucene.search.cache.IntValuesCreator@30781822=>org.apache.lucene.search.cache.CachedArray$IntValues#291172425 (size =~ 92 bytes)
    [junit] 
    [junit] *** END org.apache.solr.request.NumFacetTermsFacetsTest.testNumFacetTermsFacetCounts: Insane FieldCache usage(s) ***
    [junit] ------------- ---------------- ---------------
    [junit] Testcase: testNumFacetTermsFacetCounts(org.apache.solr.request.NumFacetTermsFacetsTest):	FAILED
    [junit] org.apache.solr.request.NumFacetTermsFacetsTest.testNumFacetTermsFacetCounts: Insane FieldCache usage(s) found expected:<0> but was:<1>
    [junit] junit.framework.AssertionFailedError: org.apache.solr.request.NumFacetTermsFacetsTest.testNumFacetTermsFacetCounts: Insane FieldCache usage(s) found expected:<0> but was:<1>
    [junit] 	at org.apache.lucene.util.LuceneTestCase.assertSaneFieldCaches(LuceneTestCase.java:725)
    [junit] 	at org.apache.lucene.util.LuceneTestCase.tearDown(LuceneTestCase.java:620)
    [junit] 	at org.apache.solr.SolrTestCaseJ4.tearDown(SolrTestCaseJ4.java:96)
    [junit] 	at org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1430)
    [junit] 	at org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1348)
    [junit] 
    [junit] 
    [junit] Test org.apache.solr.request.NumFacetTermsFacetsTest FAILED

{code}

> Get distinct count of names for a facet field
> ---------------------------------------------
>
>                 Key: SOLR-2242
>                 URL: https://issues.apache.org/jira/browse/SOLR-2242
>             Project: Solr
>          Issue Type: New Feature
>          Components: Response Writers
>    Affects Versions: 4.0
>            Reporter: Bill Bell
>            Assignee: Simon Willnauer
>            Priority: Minor
>             Fix For: 4.0
>
>         Attachments: SOLR-2242.patch, SOLR-2242.shard.patch, SOLR-2242.shard.patch, SOLR-2242.solr3.1.patch, SOLR.2242.solr3.1.patch, SOLR.2242.v2.patch
>
>
> When returning facet.field=<name of field> you will get a list of matches for distinct values. This is normal behavior. This patch tells you how many distinct values you have (# of rows). Use with limit=-1 and mincount=1.
> The feature is called "namedistinct". Here is an example:
> http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=2&facet.limit=-1&facet.field=price
> http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=0&facet.limit=-1&facet.field=price
> http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=1&facet.limit=-1&facet.field=price
> This currently only works on facet.field.
> {code}
> <lst name="facet_fields">
>   <lst name="price">
>     <int name="numFacetTerms">14</int>
>     <int name="0.0">3</int><int name="11.5">1</int><int name="19.95">1</int><int name="74.99">1</int><int name="92.0">1</int><int name="179.99">1</int><int name="185.0">1</int><int name="279.95">1</int><int name="329.95">1</int><int name="350.0">1</int><int name="399.0">1</int><int name="479.95">1</int><int name="649.99">1</int><int name="2199.0">1</int>
>   </lst>
> </lst>
> {code} 
> Several people use this to get the group.field count (the # of groups).



[jira] Commented: (SOLR-2242) Get distinct count of names for a facet field

Posted by "Bill Bell (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13006775#comment-13006775 ] 

Bill Bell commented on SOLR-2242:
---------------------------------

Btw,

I hope "constraints" means unique names. It is different from the number of
constraints. There might be a need for the number of constraints, but that is
not what this ticket is for.

So, I think I am going to reject your proposed naming in favor of mine:

{code}
Proposed:
"facet fields" : {"hgid" : {
  "missing" : 25,
  "namedistinct" : 25,
  "constraints": 1250,
  "counts" : ["constraint",10,...]
}}
{code}


Those are 2 different things.
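
To make the distinction concrete, here is a minimal sketch that uses the hgid counts from the issue description quoted below. The class and method names are hypothetical and it uses plain java.util collections rather than any Solr class. It only illustrates "number of distinct names" versus "sum of the per-name counts"; it does not try to define what "constraints" means in the proposal above.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of any SOLR-2242 patch.
final class FacetTotals {

    /** Number of distinct facet names whose count is at least mincount. */
    static int distinctNames(Map<String, Integer> facetCounts, int mincount) {
        int n = 0;
        for (int c : facetCounts.values()) {
            if (c >= mincount) {
                n++;
            }
        }
        return n;
    }

    /** Sum of all per-name counts. */
    static int totalValues(Map<String, Integer> facetCounts) {
        int sum = 0;
        for (int c : facetCounts.values()) {
            sum += c;
        }
        return sum;
    }

    /** The hgid facet counts shown in the issue description. */
    static Map<String, Integer> hgidExample() {
        Map<String, Integer> m = new LinkedHashMap<>();
        m.put("HGPY0000045FD36D4000A", 1);
        m.put("HGPY00000FBC6690453A9", 1);
        m.put("HGPY00001E44ED6C4FB3B", 1);
        m.put("HGPY00001FA631034A1B8", 1);
        m.put("HGPY00003317ABAC43B48", 1);
        m.put("HGPY00003A17B2294CB5A", 5);
        m.put("HGPY00003ADD2B3D48C39", 1);
        return m;
    }

    public static void main(String[] args) {
        Map<String, Integer> hgid = hgidExample();
        System.out.println(distinctNames(hgid, 1)); // 7 rows
        System.out.println(totalValues(hgid));      // 11 values
    }
}
```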







> Get distinct count of names for a facet field
> ---------------------------------------------
>
>                 Key: SOLR-2242
>                 URL: https://issues.apache.org/jira/browse/SOLR-2242
>             Project: Solr
>          Issue Type: New Feature
>          Components: Response Writers
>    Affects Versions: 4.0
>            Reporter: Bill Bell
>            Priority: Minor
>             Fix For: 4.0
>
>         Attachments: SOLR-2242-distinctFacet.patch
>
>
> When returning facet.field=<name of field> you will get a list of matches for distinct values. This is normal behavior. This patch tells you how many distinct values you have (# of rows). Use with limit=-1 and mincount=1.
> The feature is called "namedistinct". Here is an example:
> http://localhost:8983/solr/select?q=*:*&facet=true&facet.field=manu&facet.mincount=1&facet.limit=-1&f.manu.facet.namedistinct=0&facet.field=price&f.price.facet.namedistinct=1
> Here is an example on field "hgid" (without namedistinct):
> {code}
> - <lst name="facet_fields">
> - <lst name="hgid">
>   <int name="HGPY0000045FD36D4000A">1</int> 
>   <int name="HGPY00000FBC6690453A9">1</int> 
>   <int name="HGPY00001E44ED6C4FB3B">1</int> 
>   <int name="HGPY00001FA631034A1B8">1</int> 
>   <int name="HGPY00003317ABAC43B48">1</int> 
>   <int name="HGPY00003A17B2294CB5A">5</int> 
>   <int name="HGPY00003ADD2B3D48C39">1</int> 
>   </lst>
>   </lst>
> {code}
> With namedistinct (HGPY0000045FD36D4000A, HGPY00000FBC6690453A9, HGPY00001E44ED6C4FB3B, HGPY00001FA631034A1B8, HGPY00003317ABAC43B48, HGPY00003A17B2294CB5A, HGPY00003ADD2B3D48C39). This returns number of rows (7), not the number of values (11).
> {code}
> - <lst name="facet_fields">
> - <lst name="hgid">
>   <int name="_count_">7</int> 
>   </lst>
>   </lst>
> {code}
> This actually works really well to get the total number of fields for a group.field=hgid. Enjoy!



[jira] [Commented] (SOLR-2242) Get distinct count of names for a facet field

Posted by "Simon Willnauer (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13053113#comment-13053113 ] 

Simon Willnauer commented on SOLR-2242:
---------------------------------------

bq. New patch ready for commit?

Bill, I still see lots of whitespace / indentation problems in the latest patch. Anyway, I looked at it and I wonder if we could restructure this a little: we could first check if termList != null and do all the cases there, and if termList == null we get the TermCountsLimit. That would remove all the redundant getTermCountsLimit / getListedTermCounts calls. The termList==null case seems very easy and straightforward:
{code}
          if (termList != null) {
            NamedList<Integer> counts = getListedTermCounts(facetValue, termList);
            switch (numFacetTerms) {
            case COUNTS:
              final NamedList<Integer> resCount = new NamedList<Integer>();
              counts = resCount;
              // no break here: execution falls through to COUNTS_AND_VALUES
            case COUNTS_AND_VALUES:
              counts.add("numFacetTerms", counts.size());
              break;
            }
            res.add(key, counts);
          } else {
            ...
{code}

Yet it's hard to refactor this without a single test (note, there might be a bug). I would be really happy to see a test case for this that tests all the variations.
Regarding the constants, I think the default case should be a constant too. If you use NamedList, can you make sure you put the right generic type on it if possible? Otherwise my IDE goes wild and adds warnings all over the place. In your case NamedList<Integer> works fine.
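
As a standalone illustration of the restructuring (not the patch itself), the three variations could look roughly like the sketch below. In the snippet above the COUNTS case swaps in an empty list before counts.size() is read, which may be the bug alluded to; the sketch captures the size first. Everything here is an assumption made for illustration: the Mode enum, the class name, and the use of a LinkedHashMap in place of Solr's NamedList (a map cannot hold a term that is itself named "numFacetTerms", which a NamedList could).

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for the facet-building code; not Solr API.
final class NumFacetTermsSketch {

    /** Assumed modes: plain facet list, total only, or total plus the list. */
    enum Mode { NONE, COUNTS, COUNTS_AND_VALUES }

    static Map<String, Integer> build(Map<String, Integer> termCounts, Mode mode) {
        // Read the number of terms before deciding what to return,
        // so it cannot be taken from an emptied result by mistake.
        int numTerms = termCounts.size();
        Map<String, Integer> result = new LinkedHashMap<>();
        switch (mode) {
            case COUNTS:
                result.put("numFacetTerms", numTerms);
                break;
            case COUNTS_AND_VALUES:
                result.put("numFacetTerms", numTerms);
                result.putAll(termCounts);
                break;
            case NONE:
            default:
                result.putAll(termCounts);
                break;
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, Integer> terms = new LinkedHashMap<>();
        terms.put("electronics", 14);
        terms.put("memory", 3);
        for (Mode mode : Mode.values()) {
            System.out.println(mode + " -> " + build(terms, mode));
        }
    }
}
```

A test for the real code would want one assertion per mode, along the lines of the three calls in main.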

simon

> Get distinct count of names for a facet field
> ---------------------------------------------
>
>                 Key: SOLR-2242
>                 URL: https://issues.apache.org/jira/browse/SOLR-2242
>             Project: Solr
>          Issue Type: New Feature
>          Components: Response Writers
>    Affects Versions: 4.0
>            Reporter: Bill Bell
>            Assignee: Simon Willnauer
>            Priority: Minor
>             Fix For: 4.0
>
>         Attachments: SOLR-2242.patch, SOLR-2242.shard.patch, SOLR-2242.shard.patch, SOLR-2242.solr3.1.patch, SOLR.2242.solr3.1.patch, SOLR.2242.v2.patch
>
>
> When returning facet.field=<name of field> you will get a list of matches for distinct values. This is normal behavior. This patch tells you how many distinct values you have (# of rows). Use with limit=-1 and mincount=1.
> The feature is called "namedistinct". Here is an example:
> http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=2&facet.limit=-1&facet.field=price
> http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=0&facet.limit=-1&facet.field=price
> http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=1&facet.limit=-1&facet.field=price
> This currently only works on facet.field.
> {code}
> <lst name="facet_fields">
>   <lst name="price">
>     <int name="numFacetTerms">14</int>
>     <int name="0.0">3</int><int name="11.5">1</int><int name="19.95">1</int><int name="74.99">1</int><int name="92.0">1</int><int name="179.99">1</int><int name="185.0">1</int><int name="279.95">1</int><int name="329.95">1</int><int name="350.0">1</int><int name="399.0">1</int><int name="479.95">1</int><int name="649.99">1</int><int name="2199.0">1</int>
>   </lst>
> </lst>
> {code} 
> Several people use this to get the group.field count (the # of groups).



[jira] [Commented] (SOLR-2242) Get distinct count of names for a facet field

Posted by "Bill Bell (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13021948#comment-13021948 ] 

Bill Bell commented on SOLR-2242:
---------------------------------

OK how do we get this committed?

> Get distinct count of names for a facet field
> ---------------------------------------------
>
>                 Key: SOLR-2242
>                 URL: https://issues.apache.org/jira/browse/SOLR-2242
>             Project: Solr
>          Issue Type: New Feature
>          Components: Response Writers
>    Affects Versions: 4.0
>            Reporter: Bill Bell
>            Priority: Minor
>             Fix For: 4.0
>
>         Attachments: SOLR.2242.solr3.1.patch, SOLR.2242.v2.patch
>
>
> When returning facet.field=<name of field> you will get a list of matches for distinct values. This is normal behavior. This patch tells you how many distinct values you have (# of rows). Use with limit=-1 and mincount=1.
> The feature is called "namedistinct". Here is an example:
> http://localhost:8983/solr/select?q=*:*&facet=true&facet.field=manu&facet.mincount=1&facet.limit=-1&f.manu.facet.namedistinct=0&facet.field=price&f.price.facet.namedistinct=1
> Here is an example on field "hgid" (without namedistinct):
> {code}
> - <lst name="facet_fields">
> - <lst name="hgid">
>   <int name="HGPY0000045FD36D4000A">1</int> 
>   <int name="HGPY00000FBC6690453A9">1</int> 
>   <int name="HGPY00001E44ED6C4FB3B">1</int> 
>   <int name="HGPY00001FA631034A1B8">1</int> 
>   <int name="HGPY00003317ABAC43B48">1</int> 
>   <int name="HGPY00003A17B2294CB5A">5</int> 
>   <int name="HGPY00003ADD2B3D48C39">1</int> 
>   </lst>
>   </lst>
> {code}
> With namedistinct (HGPY0000045FD36D4000A, HGPY00000FBC6690453A9, HGPY00001E44ED6C4FB3B, HGPY00001FA631034A1B8, HGPY00003317ABAC43B48, HGPY00003A17B2294CB5A, HGPY00003ADD2B3D48C39). This returns number of rows (7), not the number of values (11).
> {code}
> - <lst name="facet_fields">
> - <lst name="hgid">
>   <int name="_count_">7</int> 
>   </lst>
>   </lst>
> {code}
> This actually works really well to get the total number of fields for a group.field=hgid. Enjoy!



[jira] [Updated] (SOLR-2242) Get distinct count of names for a facet field

Posted by "Bill Bell (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bill Bell updated SOLR-2242:
----------------------------

    Attachment:     (was: SOLR.2242.v2.patch)
    
> Get distinct count of names for a facet field
> ---------------------------------------------
>
>                 Key: SOLR-2242
>                 URL: https://issues.apache.org/jira/browse/SOLR-2242
>             Project: Solr
>          Issue Type: New Feature
>          Components: Response Writers
>    Affects Versions: 4.0
>            Reporter: Bill Bell
>            Priority: Minor
>             Fix For: 4.0
>
>         Attachments: NumFacetTermsFacetsTest.java, SOLR-2242-notworkingtest.patch, SOLR-2242-solr40-2.patch, SOLR-2242-solr40-3.patch, SOLR-2242-solr40.patch, SOLR-2242.patch, SOLR-2242.patch, SOLR-2242.patch, SOLR-2242.shard.patch, SOLR-2242.shard.patch, SOLR-2242.shard.withtests.patch, SOLR-2242.solr3.1-fix.patch, SOLR-2242.solr3.1.patch, SOLR-2242.solr35.patch, SOLR.2242.solr3.1.patch
>
>
> When returning facet.field=<name of field> you will get a list of matches for distinct values. This is normal behavior. This patch tells you how many distinct values you have (# of rows). Use with limit=-1 and mincount=1.
> The feature is called "namedistinct". Here is an example:
> http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=2&facet.limit=-1&facet.field=price
> http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=0&facet.limit=-1&facet.field=price
> http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=1&facet.limit=-1&facet.field=price
> This currently only works on facet.field.
> {code}
> <lst name="facet_fields">
>   <lst name="price">
>     <int name="numFacetTerms">14</int>
>     <int name="0.0">3</int><int name="11.5">1</int><int name="19.95">1</int><int name="74.99">1</int><int name="92.0">1</int><int name="179.99">1</int><int name="185.0">1</int><int name="279.95">1</int><int name="329.95">1</int><int name="350.0">1</int><int name="399.0">1</int><int name="479.95">1</int><int name="649.99">1</int><int name="2199.0">1</int>
>   </lst>
> </lst>
> {code} 
> Several people use this to get the group.field count (the # of groups).



[jira] Issue Comment Edited: (SOLR-2242) Get distinct count of names for a facet field

Posted by "Bill Bell (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13006805#comment-13006805 ] 

Bill Bell edited comment on SOLR-2242 at 3/15/11 6:10 AM:
----------------------------------------------------------

OK this is complete.

Sample query:

http://localhost:8983/solr/select?q=*:*&facet=true&facet.field=cat&rows=0&facet.numfacetterms=2&facet.limit=4

Sample output:
{code}
  <?xml version="1.0" encoding="UTF-8" ?> 
- <response>
- <lst name="responseHeader">
  <int name="status">0</int> 
  <int name="QTime">0</int> 
- <lst name="params">
  <str name="facet.numfacetterms">2</str> 
  <str name="facet">true</str> 
  <str name="q">*:*</str> 
  <str name="facet.limit">4</str> 
  <str name="facet.field">cat</str> 
  <str name="rows">0</str> 
  </lst>
  </lst>
  <result name="response" numFound="17" start="0" /> 
- <lst name="facet_counts">
  <lst name="facet_queries" /> 
- <lst name="facet_fields">
- <lst name="cat">
  <int name="numFacetTerms">14</int> 
- <lst name="counts">
  <int name="electronics">14</int> 
  <int name="memory">3</int> 
  <int name="connector">2</int> 
  <int name="graphics card">2</int> 
  </lst>
  </lst>
  </lst>
  <lst name="facet_dates" /> 
  <lst name="facet_ranges" /> 
  </lst>
  </response>
{code}

In JSON:

{code}
"facet_fields":{"cat":["numFacetTerms",14,"counts",["electronics",14,"memory",3,"connector",2,"graphics card",2]]},"facet_dates":{},"facet_ranges":{}}}
{code}
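
In the JSON above, each list arrives as a flat array of alternating names and values. A client that has already parsed the JSON into Java lists could turn that into a map roughly as follows. This is only a sketch: the class and method names are hypothetical, and the input list in main is typed in by hand from the sample output rather than produced by a JSON parser.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical client-side helper; not part of Solr or SolrJ.
final class FlatNamedList {

    /** Converts [name1, value1, name2, value2, ...] into an ordered map. */
    static Map<String, Object> toMap(List<?> flat) {
        if (flat.size() % 2 != 0) {
            throw new IllegalArgumentException("expected name/value pairs");
        }
        Map<String, Object> out = new LinkedHashMap<>();
        for (int i = 0; i < flat.size(); i += 2) {
            out.put((String) flat.get(i), flat.get(i + 1));
        }
        return out;
    }

    public static void main(String[] args) {
        // The "cat" entry from the sample output, already parsed into lists.
        List<Object> cat = Arrays.<Object>asList(
            "numFacetTerms", 14,
            "counts", Arrays.<Object>asList(
                "electronics", 14, "memory", 3, "connector", 2, "graphics card", 2));

        Map<String, Object> field = toMap(cat);
        Map<String, Object> counts = toMap((List<?>) field.get("counts"));
        System.out.println(field.get("numFacetTerms"));
        System.out.println(counts);
    }
}
```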

      was (Author: billnbell):
    OK this is complete.

Sample query:

http://localhost:8983/solr/select?q=*:*&facet=true&facet.field=cat&rows=0&facet.numfacetterms=2&facet.limit=4

Sample output:
{code}
  <?xml version="1.0" encoding="UTF-8" ?> 
- <response>
- <lst name="responseHeader">
  <int name="status">0</int> 
  <int name="QTime">0</int> 
- <lst name="params">
  <str name="facet.numfacetterms">2</str> 
  <str name="facet">true</str> 
  <str name="q">*:*</str> 
  <str name="facet.limit">4</str> 
  <str name="facet.field">cat</str> 
  <str name="rows">0</str> 
  </lst>
  </lst>
  <result name="response" numFound="17" start="0" /> 
- <lst name="facet_counts">
  <lst name="facet_queries" /> 
- <lst name="facet_fields">
- <lst name="cat">
  <int name="numFacetTerms">14</int> 
- <lst name="counts">
  <int name="electronics">14</int> 
  <int name="memory">3</int> 
  <int name="connector">2</int> 
  <int name="graphics card">2</int> 
  </lst>
  </lst>
  </lst>
  <lst name="facet_dates" /> 
  <lst name="facet_ranges" /> 
  </lst>
  </response>
{code}

In Json:

{code}
{"responseHeader":{"status":0,"QTime":0,"params":{"facet.numfacetterms":"2","facet":"true","q":"*:*","facet.limit":"4","facet.field":"cat","wt":"json","rows":"0"}},"response":{"numFound":17,"start":0,"docs":[]},"facet_counts":{"facet_queries":{},"facet_fields":{"cat":["numFacetTerms",14,"counts",["electronics",14,"memory",3,"connector",2,"graphics card",2]]},"facet_dates":{},"facet_ranges":{}}}

{code}
  
> Get distinct count of names for a facet field
> ---------------------------------------------
>
>                 Key: SOLR-2242
>                 URL: https://issues.apache.org/jira/browse/SOLR-2242
>             Project: Solr
>          Issue Type: New Feature
>          Components: Response Writers
>    Affects Versions: 4.0
>            Reporter: Bill Bell
>            Priority: Minor
>             Fix For: 4.0
>
>         Attachments: SOLR-2242-distinctFacet.patch, SOLR-2242.v2.patch
>
>
> When returning facet.field=<name of field> you will get a list of matches for distinct values. This is normal behavior. This patch tells you how many distinct values you have (# of rows). Use with limit=-1 and mincount=1.
> The feature is called "namedistinct". Here is an example:
> http://localhost:8983/solr/select?q=*:*&facet=true&facet.field=manu&facet.mincount=1&facet.limit=-1&f.manu.facet.namedistinct=0&facet.field=price&f.price.facet.namedistinct=1
> Here is an example on field "hgid" (without namedistinct):
> {code}
> - <lst name="facet_fields">
> - <lst name="hgid">
>   <int name="HGPY0000045FD36D4000A">1</int> 
>   <int name="HGPY00000FBC6690453A9">1</int> 
>   <int name="HGPY00001E44ED6C4FB3B">1</int> 
>   <int name="HGPY00001FA631034A1B8">1</int> 
>   <int name="HGPY00003317ABAC43B48">1</int> 
>   <int name="HGPY00003A17B2294CB5A">5</int> 
>   <int name="HGPY00003ADD2B3D48C39">1</int> 
>   </lst>
>   </lst>
> {code}
> With namedistinct (HGPY0000045FD36D4000A, HGPY00000FBC6690453A9, HGPY00001E44ED6C4FB3B, HGPY00001FA631034A1B8, HGPY00003317ABAC43B48, HGPY00003A17B2294CB5A, HGPY00003ADD2B3D48C39). This returns number of rows (7), not the number of values (11).
> {code}
> - <lst name="facet_fields">
> - <lst name="hgid">
>   <int name="_count_">7</int> 
>   </lst>
>   </lst>
> {code}
> This actually works really well to get the total number of fields for a group.field=hgid. Enjoy!



[jira] [Issue Comment Edited] (SOLR-2242) Get distinct count of names for a facet field

Posted by "Erick Erickson (Issue Comment Edited) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13174441#comment-13174441 ] 

Erick Erickson edited comment on SOLR-2242 at 12/21/11 9:50 PM:
----------------------------------------------------------------

First step in resurrecting this. This patch should apply cleanly to trunk. It incorporates the SOLR-2242.patch from 28-June and the NumFacetTermsFacetsTest from 9-July. It accounts for the fact that things seem to have been moved around a bit.
                
      was (Author: erickerickson):
    First step in resurrecting this. This patch should apply cleanly to trunk. It incorporates the SOLR-2242.patch from 28-June and the NmFacetTermsFacetsTest from 9-July. It accounts for the fact that things seem to have been moved around a bit.
                  
> Get distinct count of names for a facet field
> ---------------------------------------------
>
>                 Key: SOLR-2242
>                 URL: https://issues.apache.org/jira/browse/SOLR-2242
>             Project: Solr
>          Issue Type: New Feature
>          Components: Response Writers
>    Affects Versions: 4.0
>            Reporter: Bill Bell
>            Assignee: Erick Erickson
>            Priority: Minor
>             Fix For: 4.0
>
>         Attachments: NumFacetTermsFacetsTest.java, SOLR-2242-notworkingtest.patch, SOLR-2242.patch, SOLR-2242.patch, SOLR-2242.patch, SOLR-2242.shard.patch, SOLR-2242.shard.patch, SOLR-2242.shard.withtests.patch, SOLR-2242.solr3.1-fix.patch, SOLR-2242.solr3.1.patch, SOLR.2242.solr3.1.patch, SOLR.2242.v2.patch
>
>
> When returning facet.field=<name of field> you will get a list of matches for distinct values. This is normal behavior. This patch tells you how many distinct values you have (# of rows). Use with limit=-1 and mincount=1.
> The feature is called "namedistinct". Here is an example:
> http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=2&facet.limit=-1&facet.field=price
> http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=0&facet.limit=-1&facet.field=price
> http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=1&facet.limit=-1&facet.field=price
> This currently only works on facet.field.
> {code}
> <lst name="facet_fields">
>   <lst name="price">
>     <int name="numFacetTerms">14</int>
>     <int name="0.0">3</int><int name="11.5">1</int><int name="19.95">1</int><int name="74.99">1</int><int name="92.0">1</int><int name="179.99">1</int><int name="185.0">1</int><int name="279.95">1</int><int name="329.95">1</int><int name="350.0">1</int><int name="399.0">1</int><int name="479.95">1</int><int name="649.99">1</int><int name="2199.0">1</int>
>   </lst>
> </lst>
> {code} 
> Several people use this to get the group.field count (the # of groups).



[jira] [Assigned] (SOLR-2242) Get distinct count of names for a facet field

Posted by "Erick Erickson (Assigned) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Erick Erickson reassigned SOLR-2242:
------------------------------------

    Assignee:     (was: Erick Erickson)

I won't get to this for 3.6
                
> Get distinct count of names for a facet field
> ---------------------------------------------
>
>                 Key: SOLR-2242
>                 URL: https://issues.apache.org/jira/browse/SOLR-2242
>             Project: Solr
>          Issue Type: New Feature
>          Components: Response Writers
>    Affects Versions: 4.0
>            Reporter: Bill Bell
>            Priority: Minor
>             Fix For: 4.0
>
>         Attachments: NumFacetTermsFacetsTest.java, SOLR-2242-notworkingtest.patch, SOLR-2242-solr40.patch, SOLR-2242.patch, SOLR-2242.patch, SOLR-2242.patch, SOLR-2242.shard.patch, SOLR-2242.shard.patch, SOLR-2242.shard.withtests.patch, SOLR-2242.solr3.1-fix.patch, SOLR-2242.solr3.1.patch, SOLR.2242.solr3.1.patch, SOLR.2242.v2.patch
>
>
> When returning facet.field=<name of field> you will get a list of matches for distinct values. This is normal behavior. This patch tells you how many distinct values you have (# of rows). Use with limit=-1 and mincount=1.
> The feature is called "namedistinct". Here is an example:
> http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=2&facet.limit=-1&facet.field=price
> http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=0&facet.limit=-1&facet.field=price
> http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=1&facet.limit=-1&facet.field=price
> This currently only works on facet.field.
> {code}
> <lst name="facet_fields">
>   <lst name="price">
>     <int name="numFacetTerms">14</int>
>     <int name="0.0">3</int><int name="11.5">1</int><int name="19.95">1</int><int name="74.99">1</int><int name="92.0">1</int><int name="179.99">1</int><int name="185.0">1</int><int name="279.95">1</int><int name="329.95">1</int><int name="350.0">1</int><int name="399.0">1</int><int name="479.95">1</int><int name="649.99">1</int><int name="2199.0">1</int>
>   </lst>
> </lst>
> {code} 
> Several people use this to get the group.field count (the # of groups).



[jira] [Issue Comment Edited] (SOLR-2242) Get distinct count of names for a facet field

Posted by "Erick Erickson (Issue Comment Edited) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13174152#comment-13174152 ] 

Erick Erickson edited comment on SOLR-2242 at 12/21/11 3:45 PM:
----------------------------------------------------------------

OK, it seems like we have several themes here. I'd like to get a reasonable consensus before going forward... I'll put out a straw-man proposal here and we can go from there.

But let's figure out where we're going before revamping stuff yet again.

1> Distributed support. I sure don't see a good way to support this currently. Perhaps some of the future enhancements will make this easier (thinking distributed TF/IDF & such while being totally ignorant of that code), but returning the entire list of constraints (or names or terms or whatever we call it) is just a bad idea. The first time someone tries this on a field with 1,000,000 terms (yes, I've seen this) it'll just blow things up. I'm also slightly anti the min/max idea. I'm not sure what value there is in telling someone "there are between 10,000 and 90,000 distinct values". And if it's a field with just a few pre-defined values, that information is already known anyway.... But if someone can show a use-case here I'm not completely against it. But I'd like to see the use case first, not "someone might find it useful" <G>.

2> back compat. Cody's suggestion seems to be the slickest in terms of not breaking things, but we use attributes in just a few places, are there reasons NOT to do it that way? Or does this mess up JSON, PHP, etc?

3> Possibly add a new JIRA for changing the facet response format to be tolerant of sub-fields, but don't do that here.

Again, I want a clearly defined end point for the concerns raised before we dive back in here....
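
The trade-off in point 1> can be sketched with plain sets. The sketch assumes each shard can report the distinct terms it saw; the class and method names are hypothetical and none of this is Solr code. With only per-shard distinct counts, the global count can merely be bounded (at least the largest shard count, at most the sum of all shard counts), which is presumably what a min/max response would carry. The exact number needs the union of the term sets, meaning every term has to be shipped to the coordinating node.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration of distributed distinct counting; not Solr API.
final class ShardDistinct {

    /** Exact distinct count: needs the full term set from every shard. */
    static int exact(List<Set<String>> shardTerms) {
        Set<String> union = new HashSet<>();
        for (Set<String> terms : shardTerms) {
            union.addAll(terms);
        }
        return union.size();
    }

    /** Lower bound from counts alone: the largest per-shard distinct count. */
    static int lowerBound(List<Set<String>> shardTerms) {
        int max = 0;
        for (Set<String> terms : shardTerms) {
            max = Math.max(max, terms.size());
        }
        return max;
    }

    /** Upper bound from counts alone: the sum of per-shard distinct counts. */
    static long upperBound(List<Set<String>> shardTerms) {
        long sum = 0;
        for (Set<String> terms : shardTerms) {
            sum += terms.size();
        }
        return sum;
    }

    public static void main(String[] args) {
        List<Set<String>> shards = Arrays.<Set<String>>asList(
            new HashSet<>(Arrays.asList("a", "b", "c")),
            new HashSet<>(Arrays.asList("b", "c", "d")));
        System.out.println(lowerBound(shards)); // 3
        System.out.println(exact(shards));      // 4
        System.out.println(upperBound(shards)); // 6
    }
}
```

On a field with 1,000,000 terms the exact variant moves every term across the wire, which is the blow-up described in point 1>.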


                
      was (Author: erickerickson):
    OK, it seems like we have several themes here. I'd like to get a reasonable consensus before going forward... I'll put out a straw-man proposal here and we can go from there.

But let's figure out where we're going before revamping stuff yet again.

1> Distributed support. I sure don't see a good way to support this currently. Perhaps some of the future enhancements will make this easier (thinking distributed TF/IDF & such while being totally ignorant of that code), but returning the entire list of constraints (or names or terms or whatever we call it) is just a bad idea. The first time someone tries this on a field with 1,000,000 terms (yes, I've seen this) it'll just blow things up. I'm also slightly anti the min/max idea. I'm not sure what value there is in telling someone "there are between 10,000 and 90,000 distinct values". And if it's a field with just a few pre-defined values, that information is already known anyway.... But if someone can show a use-case here I'm not completely against it. But I'd like to see the use case first, not "someone might find it useful" <G>.

2> back compat. Cody's suggestion seems to be the slickest in terms of not breaking things, but we use attributes in just a few places, are there reasons NOT to do it that way?

3> Possibly add a new JIRA for changing the facet response format to be tolerant of sub-fields, but don't do that here.

Again, I want a clearly defined end point for the concerns raised before we dive back in here....


                  
> Get distinct count of names for a facet field
> ---------------------------------------------
>
>                 Key: SOLR-2242
>                 URL: https://issues.apache.org/jira/browse/SOLR-2242
>             Project: Solr
>          Issue Type: New Feature
>          Components: Response Writers
>    Affects Versions: 4.0
>            Reporter: Bill Bell
>            Assignee: Erick Erickson
>            Priority: Minor
>             Fix For: 4.0
>
>         Attachments: NumFacetTermsFacetsTest.java, SOLR-2242-notworkingtest.patch, SOLR-2242.patch, SOLR-2242.patch, SOLR-2242.shard.patch, SOLR-2242.shard.patch, SOLR-2242.shard.withtests.patch, SOLR-2242.solr3.1-fix.patch, SOLR-2242.solr3.1.patch, SOLR.2242.solr3.1.patch, SOLR.2242.v2.patch
>
>
> When returning facet.field=<name of field> you will get a list of matches for distinct values. This is normal behavior. This patch tells you how many distinct values you have (# of rows). Use with limit=-1 and mincount=1.
> The feature is called "namedistinct". Here is an example:
> http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=2&facet.limit=-1&facet.field=price
> http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=0&facet.limit=-1&facet.field=price
> http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=1&facet.limit=-1&facet.field=price
> This currently only works on facet.field.
> {code}
> <lst name="facet_fields">
>   <lst name="price">
>     <int name="numFacetTerms">14</int>
>     <int name="0.0">3</int><int name="11.5">1</int><int name="19.95">1</int><int name="74.99">1</int><int name="92.0">1</int><int name="179.99">1</int><int name="185.0">1</int><int name="279.95">1</int><int name="329.95">1</int><int name="350.0">1</int><int name="399.0">1</int><int name="479.95">1</int><int name="649.99">1</int><int name="2199.0">1</int>
>   </lst>
> </lst>
> {code} 
> Several people use this to get the group.field count (the # of groups).



[jira] [Commented] (SOLR-2242) Get distinct count of names for a facet field

Posted by "Bill Bell (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13067500#comment-13067500 ] 

Bill Bell commented on SOLR-2242:
---------------------------------

Simon - thoughts?

> Get distinct count of names for a facet field
> ---------------------------------------------
>
>                 Key: SOLR-2242
>                 URL: https://issues.apache.org/jira/browse/SOLR-2242
>             Project: Solr
>          Issue Type: New Feature
>          Components: Response Writers
>    Affects Versions: 4.0
>            Reporter: Bill Bell
>            Assignee: Simon Willnauer
>            Priority: Minor
>             Fix For: 4.0
>
>         Attachments: NumFacetTermsFacetsTest.java, SOLR-2242-notworkingtest.patch, SOLR-2242.patch, SOLR-2242.patch, SOLR-2242.shard.patch, SOLR-2242.shard.patch, SOLR-2242.shard.withtests.patch, SOLR-2242.solr3.1.patch, SOLR.2242.solr3.1.patch, SOLR.2242.v2.patch
>
>
> When returning facet.field=<name of field> you will get a list of matches for distinct values. This is normal behavior. This patch tells you how many distinct values you have (# of rows). Use with limit=-1 and mincount=1.
> The feature is called "namedistinct". Here is an example:
> http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=2&facet.limit=-1&facet.field=price
> http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=0&facet.limit=-1&facet.field=price
> http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=1&facet.limit=-1&facet.field=price
> This currently only works on facet.field.
> {code}
> <lst name="facet_fields">
>   <lst name="price">
>     <int name="numFacetTerms">14</int>
>     <int name="0.0">3</int><int name="11.5">1</int><int name="19.95">1</int><int name="74.99">1</int><int name="92.0">1</int><int name="179.99">1</int><int name="185.0">1</int><int name="279.95">1</int><int name="329.95">1</int><int name="350.0">1</int><int name="399.0">1</int><int name="479.95">1</int><int name="649.99">1</int><int name="2199.0">1</int>
>   </lst>
> </lst>
> {code} 
> Several people use this to get the group.field count (the # of groups).



[jira] [Commented] (SOLR-2242) Get distinct count of names for a facet field

Posted by "Bill Bell (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13048964#comment-13048964 ] 

Bill Bell commented on SOLR-2242:
---------------------------------

Lance,

There are literally 15 lines of code changes. Not sure how you cannot follow it. I could use no memory and just loop through the results, but that would not be cached - so it would still be slow since I need to pull in the array in order to count it.

The parameter is not called namedistinct anymore... It is now facet.numFacetTerms, with values 2, 1, or 0.

All other parameters are good. Also you do not need anything else to get it to work, since I set the defaults to work for you now.

I'll see if I can write some more tests. Here is the rub: I would be happy to write hundreds of test cases if I knew someone was going to actually help me get this done. I am used to having a committer actually work with me - Mike McCandless is awesome and we worked on several issues together. But I have seen tons of features die when no one is willing to help. So here I am wanting, willing and able to get this done. And I have no one willing to assist from a committer perspective... The patch works fine in sharded and normal mode. So people can use it today. It is just not committed.

I have 4 clients using it in production and one has 100M page views a year, and so far no problems.

http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=2&facet.limit=-1&facet.field=price




> Get distinct count of names for a facet field
> ---------------------------------------------
>
>                 Key: SOLR-2242
>                 URL: https://issues.apache.org/jira/browse/SOLR-2242
>             Project: Solr
>          Issue Type: New Feature
>          Components: Response Writers
>    Affects Versions: 4.0
>            Reporter: Bill Bell
>            Priority: Minor
>             Fix For: 4.0
>
>         Attachments: SOLR-2242.patch, SOLR-2242.shard.patch, SOLR-2242.solr3.1.patch, SOLR.2242.solr3.1.patch, SOLR.2242.v2.patch
>
>
> When returning facet.field=<name of field> you will get a list of matches for distinct values. This is normal behavior. This patch tells you how many distinct values you have (# of rows). Use with limit=-1 and mincount=1.
> The feature is called "namedistinct". Here is an example:
> http://localhost:8983/solr/select?q=*:*&facet=true&facet.field=manu&facet.mincount=1&facet.limit=-1&f.manu.facet.namedistinct=0&facet.field=price&f.price.facet.namedistinct=1
> Here is an example on field "hgid" (without namedistinct):
> {code}
> - <lst name="facet_fields">
> - <lst name="hgid">
>   <int name="HGPY0000045FD36D4000A">1</int> 
>   <int name="HGPY00000FBC6690453A9">1</int> 
>   <int name="HGPY00001E44ED6C4FB3B">1</int> 
>   <int name="HGPY00001FA631034A1B8">1</int> 
>   <int name="HGPY00003317ABAC43B48">1</int> 
>   <int name="HGPY00003A17B2294CB5A">5</int> 
>   <int name="HGPY00003ADD2B3D48C39">1</int> 
>   </lst>
>   </lst>
> {code}
> With namedistinct (HGPY0000045FD36D4000A, HGPY00000FBC6690453A9, HGPY00001E44ED6C4FB3B, HGPY00001FA631034A1B8, HGPY00003317ABAC43B48, HGPY00003A17B2294CB5A, HGPY00003ADD2B3D48C39). This returns number of rows (7), not the number of values (11).
> {code}
> - <lst name="facet_fields">
> - <lst name="hgid">
>   <int name="_count_">7</int> 
>   </lst>
>   </lst>
> {code}
> This works actually really good to get total number of fields for a group.field=hgid. Enjoy!



[jira] [Commented] (SOLR-2242) Get distinct count of names for a facet field

Posted by "Ryan McKinley (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13072819#comment-13072819 ] 

Ryan McKinley commented on SOLR-2242:
-------------------------------------

Ya, always sending the whole list seems like asking for problems.  You can control how many terms it should pass around with facet.limit, and we could potentially add a warning message to the response if that is less than the total number of terms.

Maybe we could also have facet.distrib.limit or something, that would bump up the number that it internally asks for, but still respect facet.limit for the final result?
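
To make the idea concrete, here is a minimal Python sketch (not taken from any patch) of merging per-shard facet counts, trimming to facet.limit, and flagging when the limit cut the list short. The function name and sample data are hypothetical, and facet.distrib.limit is only the name floated above, not an existing parameter. Note that if the shards themselves return truncated lists, the merged counts can undercount, which is the reason to ask each shard for more than facet.limit.

```python
from collections import Counter


def merge_shard_facets(shard_counts, limit):
    """Merge per-shard {term: count} dicts and apply a final limit.

    Returns (top_terms, truncated). `truncated` is True when more distinct
    terms were seen than the limit allows, i.e. when a warning would apply.
    A negative limit means "no limit", mirroring facet.limit=-1.
    """
    total = Counter()
    for counts in shard_counts:
        total.update(counts)  # adds counts for terms seen on several shards
    # Sort by descending count, then by term for a stable order.
    ranked = sorted(total.items(), key=lambda kv: (-kv[1], kv[0]))
    if limit < 0:
        return ranked, False
    return ranked[:limit], len(ranked) > limit


# Hypothetical data: each shard was asked for more terms than the final limit.
top, truncated = merge_shard_facets([{"a": 2, "b": 1}, {"a": 1, "c": 1}], limit=2)
```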



> Get distinct count of names for a facet field
> ---------------------------------------------
>
>                 Key: SOLR-2242
>                 URL: https://issues.apache.org/jira/browse/SOLR-2242
>             Project: Solr
>          Issue Type: New Feature
>          Components: Response Writers
>    Affects Versions: 4.0
>            Reporter: Bill Bell
>            Assignee: Simon Willnauer
>            Priority: Minor
>             Fix For: 4.0
>
>         Attachments: NumFacetTermsFacetsTest.java, SOLR-2242-notworkingtest.patch, SOLR-2242.patch, SOLR-2242.patch, SOLR-2242.shard.patch, SOLR-2242.shard.patch, SOLR-2242.shard.withtests.patch, SOLR-2242.solr3.1.patch, SOLR.2242.solr3.1.patch, SOLR.2242.v2.patch
>
>
> When returning facet.field=<name of field> you will get a list of matches for distinct values. This is normal behavior. This patch tells you how many distinct values you have (# of rows). Use with limit=-1 and mincount=1.
> The feature is called "namedistinct". Here is an example:
> http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=2&facet.limit=-1&facet.field=price
> http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=0&facet.limit=-1&facet.field=price
> http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=1&facet.limit=-1&facet.field=price
> This currently only works on facet.field.
> {code}
> <lst name="facet_fields">
>   <lst name="price">
>     <int name="numFacetTerms">14</int>
>     <int name="0.0">3</int><int name="11.5">1</int><int name="19.95">1</int><int name="74.99">1</int><int name="92.0">1</int><int name="179.99">1</int><int name="185.0">1</int><int name="279.95">1</int><int name="329.95">1</int><int name="350.0">1</int><int name="399.0">1</int><int name="479.95">1</int><int name="649.99">1</int><int name="2199.0">1</int>
>   </lst>
> </lst>
> {code} 
> Several people use this to get the group.field count (the # of groups).



[jira] [Commented] (SOLR-2242) Get distinct count of names for a facet field

Posted by "Bill Bell (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13054931#comment-13054931 ] 

Bill Bell commented on SOLR-2242:
---------------------------------

re: whitespace

What are the settings supposed to be for tabs? Because in my editor it looks perfect. 4 spaces, tabs, 2 spaces per tab?

I will add some tests.

I think switching from if to switch and the movement to termList != null is mostly just style and does not really improve anything. I actually think it confuses things and makes the overall patch larger and riskier, since we could miss something or mess it up.

I will also look at the Integer generic... Thanks.

Bill


> Get distinct count of names for a facet field
> ---------------------------------------------
>
>                 Key: SOLR-2242
>                 URL: https://issues.apache.org/jira/browse/SOLR-2242
>             Project: Solr
>          Issue Type: New Feature
>          Components: Response Writers
>    Affects Versions: 4.0
>            Reporter: Bill Bell
>            Assignee: Simon Willnauer
>            Priority: Minor
>             Fix For: 4.0
>
>         Attachments: SOLR-2242.patch, SOLR-2242.shard.patch, SOLR-2242.shard.patch, SOLR-2242.solr3.1.patch, SOLR.2242.solr3.1.patch, SOLR.2242.v2.patch
>
>
> When returning facet.field=<name of field> you will get a list of matches for distinct values. This is normal behavior. This patch tells you how many distinct values you have (# of rows). Use with limit=-1 and mincount=1.
> The feature is called "namedistinct". Here is an example:
> http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=2&facet.limit=-1&facet.field=price
> http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=0&facet.limit=-1&facet.field=price
> http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=1&facet.limit=-1&facet.field=price
> This currently only works on facet.field.
> {code}
> <lst name="facet_fields">
>   <lst name="price">
>     <int name="numFacetTerms">14</int>
>     <int name="0.0">3</int><int name="11.5">1</int><int name="19.95">1</int><int name="74.99">1</int><int name="92.0">1</int><int name="179.99">1</int><int name="185.0">1</int><int name="279.95">1</int><int name="329.95">1</int><int name="350.0">1</int><int name="399.0">1</int><int name="479.95">1</int><int name="649.99">1</int><int name="2199.0">1</int>
>   </lst>
> </lst>
> {code} 
> Several people use this to get the group.field count (the # of groups).



[jira] [Commented] (SOLR-2242) Get distinct count of names for a facet field

Posted by "Ethan Gruber (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13191185#comment-13191185 ] 

Ethan Gruber commented on SOLR-2242:
------------------------------------

+1 for me too.  I have been using this feature for almost a year.  I plan to upgrade to the newest patch/Solr trunk code, but the patch doesn't apply to the current trunk.  Do I have to check out the revision that dates to 12/21/11 to get this to work?
                
> Get distinct count of names for a facet field
> ---------------------------------------------
>
>                 Key: SOLR-2242
>                 URL: https://issues.apache.org/jira/browse/SOLR-2242
>             Project: Solr
>          Issue Type: New Feature
>          Components: Response Writers
>    Affects Versions: 4.0
>            Reporter: Bill Bell
>            Assignee: Erick Erickson
>            Priority: Minor
>             Fix For: 4.0
>
>         Attachments: NumFacetTermsFacetsTest.java, SOLR-2242-notworkingtest.patch, SOLR-2242.patch, SOLR-2242.patch, SOLR-2242.patch, SOLR-2242.shard.patch, SOLR-2242.shard.patch, SOLR-2242.shard.withtests.patch, SOLR-2242.solr3.1-fix.patch, SOLR-2242.solr3.1.patch, SOLR.2242.solr3.1.patch, SOLR.2242.v2.patch
>
>
> When returning facet.field=<name of field> you will get a list of matches for distinct values. This is normal behavior. This patch tells you how many distinct values you have (# of rows). Use with limit=-1 and mincount=1.
> The feature is called "namedistinct". Here is an example:
> http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=2&facet.limit=-1&facet.field=price
> http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=0&facet.limit=-1&facet.field=price
> http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=1&facet.limit=-1&facet.field=price
> This currently only works on facet.field.
> {code}
> <lst name="facet_fields">
>   <lst name="price">
>     <int name="numFacetTerms">14</int>
>     <int name="0.0">3</int><int name="11.5">1</int><int name="19.95">1</int><int name="74.99">1</int><int name="92.0">1</int><int name="179.99">1</int><int name="185.0">1</int><int name="279.95">1</int><int name="329.95">1</int><int name="350.0">1</int><int name="399.0">1</int><int name="479.95">1</int><int name="649.99">1</int><int name="2199.0">1</int>
>   </lst>
> </lst>
> {code} 
> Several people use this to get the group.field count (the # of groups).



[jira] Issue Comment Edited: (SOLR-2242) Get distinct count of names for a facet field

Posted by "Bill Bell (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13006792#comment-13006792 ] 

Bill Bell edited comment on SOLR-2242 at 3/15/11 5:41 AM:
----------------------------------------------------------

I am going to use your suggestion. You will not have to set the limit or mincount. Getting the numFacetTerms will be optional, and you also will be able to NOT get the hgids as well. I propose this (please comment):

This will ONLY output the numFacetTerms (no hgid facet counts):
http://localhost:8983/solr/select?q=*:*&facet=true&facet.field=hgid&f.hgid.facet.numfacetterms=1

This assumes the count will be mincount=1, and limit=-1

{code}
<lst name="facet_fields">
  <lst name="hgid">
   <int name="numFacetTerms">7</int>  <!-- this is not 11 -->
  </lst>
</lst>
{code}

This will output the numFacetTerms AND hgid:
http://localhost:8983/solr/select?q=*:*&facet=true&facet.field=hgid&facet.mincount=1&f.hgid.facet.numfacetterms=2

{code}
<lst name="facet_fields">
  <lst name="hgid">
   <int name="numFacetTerms">7</int>  <!-- this is not 11 -->
   <lst name="counts">
   	<int name="HGPY0000045FD36D4000A">1</int>
   	<int name="HGPY00000FBC6690453A9">1</int>
   	<int name="HGPY00001E44ED6C4FB3B">1</int>
   	<int name="HGPY00001FA631034A1B8">1</int>
   	<int name="HGPY00003317ABAC43B48">1</int>
   	<int name="HGPY00003A17B2294CB5A">5</int>
   	<int name="HGPY00003ADD2B3D48C39">1</int>
   </lst>
  </lst>
</lst>
{code}
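
For anyone scripting against this, a minimal Python sketch of building the two request URLs above. It assumes the per-field parameter name proposed in this comment (f.<field>.facet.numfacetterms); the helper name is hypothetical and the parameter exists only in the patch, not in stock Solr. Add facet.mincount or other parameters as needed.

```python
from urllib.parse import urlencode


def num_facet_terms_url(field, mode, base="http://localhost:8983/solr/select"):
    """Build a select URL using the per-field parameter proposed above.

    mode=1 requests only the distinct count; mode=2 requests the count
    plus the per-value facet counts.
    """
    if mode not in (1, 2):
        raise ValueError("mode must be 1 or 2")
    params = [
        ("q", "*:*"),
        ("facet", "true"),
        ("facet.field", field),
        (f"f.{field}.facet.numfacetterms", str(mode)),
    ]
    # Keep '*' and ':' unescaped so the query reads like the examples above.
    return base + "?" + urlencode(params, safe="*:")


count_only_url = num_facet_terms_url("hgid", 1)
```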

      was (Author: billnbell):
    I am going to use your suggestion. You will not have to set the limit or mincount. Getting the numFacetTerms will be optional, and you also will be able to NOT get the hgids as well. I propose this (please comment):

This will ONLY output the numFacetTerms (no hgid facet counts):
http://localhost:8983/solr/select?q=*:*&facet=true&facet.field=hgid&f.hgid.facet.numfacetterms=1

This assumes the count will be mincount=1, and limit=-1

{code}
<lst name="facet_fields">
  <lst name="hgid">
   <int name="numFacetTerms">7</int>  <!-- this is not 11 -->
  </lst>
</lst>
{code}

This will output the numFacetTerms AND hgid:
http://localhost:8983/solr/select?q=*:*&facet=true&facet.field=hgid&facet.mincount=1&f.hgid.facet.numfacetterms=2

{code}
<lst name="facet_fields">
  <lst name="hgid">
   <int name="numFacetTerms">7</int>  <!-- this is not 11 -->
   <lst name="hgid">
   	<int name="HGPY0000045FD36D4000A">1</int>
   	<int name="HGPY00000FBC6690453A9">1</int>
   	<int name="HGPY00001E44ED6C4FB3B">1</int>
   	<int name="HGPY00001FA631034A1B8">1</int>
   	<int name="HGPY00003317ABAC43B48">1</int>
   	<int name="HGPY00003A17B2294CB5A">5</int>
   	<int name="HGPY00003ADD2B3D48C39">1</int>
   </lst>
  </lst>
</lst>
{code}
  
> Get distinct count of names for a facet field
> ---------------------------------------------
>
>                 Key: SOLR-2242
>                 URL: https://issues.apache.org/jira/browse/SOLR-2242
>             Project: Solr
>          Issue Type: New Feature
>          Components: Response Writers
>    Affects Versions: 4.0
>            Reporter: Bill Bell
>            Priority: Minor
>             Fix For: 4.0
>
>         Attachments: SOLR-2242-distinctFacet.patch
>
>
> When returning facet.field=<name of field> you will get a list of matches for distinct values. This is normal behavior. This patch tells you how many distinct values you have (# of rows). Use with limit=-1 and mincount=1.
> The feature is called "namedistinct". Here is an example:
> http://localhost:8983/solr/select?q=*:*&facet=true&facet.field=manu&facet.mincount=1&facet.limit=-1&f.manu.facet.namedistinct=0&facet.field=price&f.price.facet.namedistinct=1
> Here is an example on field "hgid" (without namedistinct):
> {code}
> - <lst name="facet_fields">
> - <lst name="hgid">
>   <int name="HGPY0000045FD36D4000A">1</int> 
>   <int name="HGPY00000FBC6690453A9">1</int> 
>   <int name="HGPY00001E44ED6C4FB3B">1</int> 
>   <int name="HGPY00001FA631034A1B8">1</int> 
>   <int name="HGPY00003317ABAC43B48">1</int> 
>   <int name="HGPY00003A17B2294CB5A">5</int> 
>   <int name="HGPY00003ADD2B3D48C39">1</int> 
>   </lst>
>   </lst>
> {code}
> With namedistinct (HGPY0000045FD36D4000A, HGPY00000FBC6690453A9, HGPY00001E44ED6C4FB3B, HGPY00001FA631034A1B8, HGPY00003317ABAC43B48, HGPY00003A17B2294CB5A, HGPY00003ADD2B3D48C39). This returns number of rows (7), not the number of values (11).
> {code}
> - <lst name="facet_fields">
> - <lst name="hgid">
>   <int name="_count_">7</int> 
>   </lst>
>   </lst>
> {code}
> This works actually really good to get total number of fields for a group.field=hgid. Enjoy!



[jira] [Commented] (SOLR-2242) Get distinct count of names for a facet field

Posted by "Mark Miller (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13048466#comment-13048466 ] 

Mark Miller commented on SOLR-2242:
-----------------------------------

Hmm...yeah, fair amount of work went on here and a fair amount of interest... unfortunately, not my field (and I'm sick, on vacation, out of the country, and blah blah blah :) ). But, if no one takes this, I can get up to speed eventually - I doubt that soon though. Sorry Bill - not a lot of committers fluent in this area that are not very busy with other things.

> Get distinct count of names for a facet field
> ---------------------------------------------
>
>                 Key: SOLR-2242
>                 URL: https://issues.apache.org/jira/browse/SOLR-2242
>             Project: Solr
>          Issue Type: New Feature
>          Components: Response Writers
>    Affects Versions: 4.0
>            Reporter: Bill Bell
>            Priority: Minor
>             Fix For: 4.0
>
>         Attachments: SOLR-2242.patch, SOLR-2242.shard.patch, SOLR-2242.solr3.1.patch, SOLR.2242.solr3.1.patch, SOLR.2242.v2.patch
>
>
> When returning facet.field=<name of field> you will get a list of matches for distinct values. This is normal behavior. This patch tells you how many distinct values you have (# of rows). Use with limit=-1 and mincount=1.
> The feature is called "namedistinct". Here is an example:
> http://localhost:8983/solr/select?q=*:*&facet=true&facet.field=manu&facet.mincount=1&facet.limit=-1&f.manu.facet.namedistinct=0&facet.field=price&f.price.facet.namedistinct=1
> Here is an example on field "hgid" (without namedistinct):
> {code}
> - <lst name="facet_fields">
> - <lst name="hgid">
>   <int name="HGPY0000045FD36D4000A">1</int> 
>   <int name="HGPY00000FBC6690453A9">1</int> 
>   <int name="HGPY00001E44ED6C4FB3B">1</int> 
>   <int name="HGPY00001FA631034A1B8">1</int> 
>   <int name="HGPY00003317ABAC43B48">1</int> 
>   <int name="HGPY00003A17B2294CB5A">5</int> 
>   <int name="HGPY00003ADD2B3D48C39">1</int> 
>   </lst>
>   </lst>
> {code}
> With namedistinct (HGPY0000045FD36D4000A, HGPY00000FBC6690453A9, HGPY00001E44ED6C4FB3B, HGPY00001FA631034A1B8, HGPY00003317ABAC43B48, HGPY00003A17B2294CB5A, HGPY00003ADD2B3D48C39). This returns number of rows (7), not the number of values (11).
> {code}
> - <lst name="facet_fields">
> - <lst name="hgid">
>   <int name="_count_">7</int> 
>   </lst>
>   </lst>
> {code}
> This works actually really good to get total number of fields for a group.field=hgid. Enjoy!



[jira] [Commented] (SOLR-2242) Get distinct count of names for a facet field

Posted by "Bill Bell (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13053080#comment-13053080 ] 

Bill Bell commented on SOLR-2242:
---------------------------------

Simon,

I made all those changes except for the termsList one. I think it is useful to have the count based on terms.

See attachment.

> Get distinct count of names for a facet field
> ---------------------------------------------
>
>                 Key: SOLR-2242
>                 URL: https://issues.apache.org/jira/browse/SOLR-2242
>             Project: Solr
>          Issue Type: New Feature
>          Components: Response Writers
>    Affects Versions: 4.0
>            Reporter: Bill Bell
>            Assignee: Simon Willnauer
>            Priority: Minor
>             Fix For: 4.0
>
>         Attachments: SOLR-2242.patch, SOLR-2242.shard.patch, SOLR-2242.solr3.1.patch, SOLR.2242.solr3.1.patch, SOLR.2242.v2.patch
>
>
> When returning facet.field=<name of field> you will get a list of matches for distinct values. This is normal behavior. This patch tells you how many distinct values you have (# of rows). Use with limit=-1 and mincount=1.
> The feature is called "namedistinct". Here is an example:
> http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=2&facet.limit=-1&facet.field=price
> http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=0&facet.limit=-1&facet.field=price
> http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=1&facet.limit=-1&facet.field=price
> This currently only works on facet.field.
> {code}
> <lst name="facet_fields">
>   <lst name="price">
>     <int name="numFacetTerms">14</int>
>     <int name="0.0">3</int><int name="11.5">1</int><int name="19.95">1</int><int name="74.99">1</int><int name="92.0">1</int><int name="179.99">1</int><int name="185.0">1</int><int name="279.95">1</int><int name="329.95">1</int><int name="350.0">1</int><int name="399.0">1</int><int name="479.95">1</int><int name="649.99">1</int><int name="2199.0">1</int>
>   </lst>
> </lst>
> {code} 
> Several people use this to get the group.field count (the # of groups).



[jira] [Commented] (SOLR-2242) Get distinct count of names for a facet field

Posted by "Yonik Seeley (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13064144#comment-13064144 ] 

Yonik Seeley commented on SOLR-2242:
------------------------------------

This issue was a bit tricky to review, given that the output doesn't seem to quite match the examples.
I also wasn't exactly sure what the latest patch was, so I just looked at the patch uploaded on 28/Jun/11.

Here's my summary on what the patch currently does:

If you add facet.facetTermCounts=2 to a faceting request, you get the following:

{code}
<lst name="facet_fields">
  <lst name="text">
    <int name="electronics">14</int>
    <int name="inc">8</int>
    <int name="2.0">5</int>
    <int name="lcd">5</int>
    <int name="memory">5</int>
    <int name="numFacetTerms">385</int>
  </lst>
</lst>
{code}

If you add facet.facetTermCounts=1 to a faceting request, you get the following:

{code}
<lst name="facet_fields">
  <lst name="text">
    <int name="numFacetTerms">385</int>
  </lst>
</lst>
{code}

w.r.t. the interface, I agree with a number of Lance's observations.

- facet.numFacetTerms name: the second "Facet" is a bit redundant.  And we probably should be talking in terms of "constraints" instead of "terms".  Perhaps facet.numConstraints (or facet.nconstraints, to be consistent with group.ngroups).
- facet.nconstraints should just be a boolean... no need for "1" or "2".  If the user doesn't want to see any constraints, then they can set facet.limit=0.  This is also consistent with grouping.
- we're mixing units in the same list, and that's probably not a great idea?  Constraints have units of documents (number of documents that matched that constraint) while "numFacetTerms" has units of number of constraints.
- I think this also breaks distributed faceting due to mixing of units?  The distributed faceting code thinks that numFacetTerms is a constraint.
- We need to figure out what we are going to do in distributed mode... it doesn't seem easy to actually figure out the number of constraints without streaming them *all* back and merging (i.e. you can't just add up the numbers)
- I also agree that we should not build the entire list in memory just to get the size of that list.
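
As a rough illustration of the distributed point above (plain Python, not Solr code): the per-shard numbers cannot simply be added, because a constraint present on several shards would be counted once per shard. The shard data and function name below are hypothetical.

```python
def merged_constraint_count(shard_terms):
    """Count distinct constraints across shards by unioning their term sets."""
    seen = set()
    for terms in shard_terms:
        seen.update(terms)
    return len(seen)


# Hypothetical facet values for one field on two shards.
shard_a = {"0.0", "11.5", "19.95"}
shard_b = {"0.0", "19.95", "74.99"}

# Adding per-shard counts double-counts "0.0" and "19.95".
naive_sum = len(shard_a) + len(shard_b)
# Unioning the term sets counts each constraint once.
merged = merged_constraint_count([shard_a, shard_b])
```

The union needs every term from every shard, which is exactly the "stream them all back" cost mentioned above.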

It seems like rather than adding more magic names to the list (and risk a real collision with the actual name of a constraint), we should add more structure to the response, as previously discussed.

So if we added facet.nconstraints=true, we would get
{code}
<lst name="facet_fields">
  <lst name="text">
    <int name="numFacetTerms">385</int>
    <lst name="counts">
      <int name="electronics">14</int>
      <int name="inc">8</int>
      <int name="2.0">5</int>
      <int name="lcd">5</int>
      <int name="memory">5</int>
   </lst>
  </lst>
</lst>
{code}
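
A minimal client-side sketch in Python of reading this nested format, assuming the response looks exactly like the example above. The function name is hypothetical. Because the count and the constraints live in separate places, a real constraint named "numFacetTerms" could no longer collide with the total.

```python
import xml.etree.ElementTree as ET

# The proposed structured response from the example above.
RESPONSE = """<lst name="facet_fields">
  <lst name="text">
    <int name="numFacetTerms">385</int>
    <lst name="counts">
      <int name="electronics">14</int>
      <int name="inc">8</int>
      <int name="2.0">5</int>
      <int name="lcd">5</int>
      <int name="memory">5</int>
    </lst>
  </lst>
</lst>"""


def read_field_facets(xml_text, field):
    """Return (number_of_constraints, {constraint: doc_count}) for one field."""
    root = ET.fromstring(xml_text)
    node = root.find(f"./lst[@name='{field}']")
    if node is None:
        raise KeyError(field)
    # The total has units of constraints and sits beside the counts list.
    total = int(node.find("./int[@name='numFacetTerms']").text)
    # Each entry under "counts" has units of matching documents.
    counts = {
        entry.get("name"): int(entry.text)
        for entry in node.findall("./lst[@name='counts']/int")
    }
    return total, counts


total, counts = read_field_facets(RESPONSE, "text")
```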

And when we use this new format, we should consider using a separate "missing" name for facet.missing=true instead of mixing the null name in with the counts.

This format change is where we need to be careful about back compat - this interface is one of the most widely used, and with all the 3rd party clients and libraries out there, we should still support the old format via a facet.format parameter or something.

Bill: You originally opened this issue for use with grouping to get the total number of groups. Are you aware of the group.ngroups parameter that was added that does this?


> Get distinct count of names for a facet field
> ---------------------------------------------
>
>                 Key: SOLR-2242
>                 URL: https://issues.apache.org/jira/browse/SOLR-2242
>             Project: Solr
>          Issue Type: New Feature
>          Components: Response Writers
>    Affects Versions: 4.0
>            Reporter: Bill Bell
>            Assignee: Simon Willnauer
>            Priority: Minor
>             Fix For: 4.0
>
>         Attachments: NumFacetTermsFacetsTest.java, SOLR-2242-notworkingtest.patch, SOLR-2242.patch, SOLR-2242.patch, SOLR-2242.shard.patch, SOLR-2242.shard.patch, SOLR-2242.shard.withtests.patch, SOLR-2242.solr3.1.patch, SOLR.2242.solr3.1.patch, SOLR.2242.v2.patch
>
>
> When returning facet.field=<name of field> you will get a list of matches for distinct values. This is normal behavior. This patch tells you how many distinct values you have (# of rows). Use with limit=-1 and mincount=1.
> The feature is called "namedistinct". Here is an example:
> http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=2&facet.limit=-1&facet.field=price
> http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=0&facet.limit=-1&facet.field=price
> http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=1&facet.limit=-1&facet.field=price
> This currently only works on facet.field.
> {code}
> <lst name="facet_fields">
>   <lst name="price">
>     <int name="numFacetTerms">14</int>
>     <int name="0.0">3</int>
>     <int name="11.5">1</int>
>     <int name="19.95">1</int>
>     <int name="74.99">1</int>
>     <int name="92.0">1</int>
>     <int name="179.99">1</int>
>     <int name="185.0">1</int>
>     <int name="279.95">1</int>
>     <int name="329.95">1</int>
>     <int name="350.0">1</int>
>     <int name="399.0">1</int>
>     <int name="479.95">1</int>
>     <int name="649.99">1</int>
>     <int name="2199.0">1</int>
>   </lst>
> </lst>
> {code} 
> Several people use this to get the group.field count (the # of groups).

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] [Issue Comment Edited] (SOLR-2242) Get distinct count of names for a facet field

Posted by "Bill Bell (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13026103#comment-13026103 ] 

Bill Bell edited comment on SOLR-2242 at 4/28/11 3:51 AM:
----------------------------------------------------------

Lance Norskog,

What do you want it to be called? I could use a committer to take this issue on. It has several votes and lots of downloads. People are using it successfully already.

Do you want me to switch numFacetTerms to numFacetNames? Anything else? I feel like we are going in circles on this issue.

{code}

This will output the numFacetNames AND hgid:
http://localhost:8983/solr/select?q=*:*&facet=true&facet.field=hgid&facet.mincount=1&f.hgid.facet.numFacetNames=2

<lst name="facet_fields">
  <lst name="hgid">
   <int name="numFacetNames">7</int>  <!-- this is not 11 -->
   <lst name="counts">
   	<int name="HGPY0000045FD36D4000A">1</int>
   	<int name="HGPY00000FBC6690453A9">1</int>
   	<int name="HGPY00001E44ED6C4FB3B">1</int>
   	<int name="HGPY00001FA631034A1B8">1</int>
   	<int name="HGPY00003317ABAC43B48">1</int>
   	<int name="HGPY00003A17B2294CB5A">5</int>
   	<int name="HGPY00003ADD2B3D48C39">1</int>
   </lst>
  </lst>
</lst>

{code}

      was (Author: billnbell):
    Lance Norskog,

What do you want it to be called? I could use a committer to take this issue on. It has several votes and lots of downloads. People are using it successfully already.

Do you want me to switch numFacetTerms to numFacetNames? Anything else? I feel like we are going in circles on this issue.

{code}

This will output the numFacetTerms AND hgid:
http://localhost:8983/solr/select?q=*:*&facet=true&facet.field=hgid&facet.mincount=1&f.hgid.facet.numFacetTerms=2

<lst name="facet_fields">
  <lst name="hgid">
   <int name="numFacetTerms">7</int>  <!-- this is not 11 -->
   <lst name="counts">
   	<int name="HGPY0000045FD36D4000A">1</int>
   	<int name="HGPY00000FBC6690453A9">1</int>
   	<int name="HGPY00001E44ED6C4FB3B">1</int>
   	<int name="HGPY00001FA631034A1B8">1</int>
   	<int name="HGPY00003317ABAC43B48">1</int>
   	<int name="HGPY00003A17B2294CB5A">5</int>
   	<int name="HGPY00003ADD2B3D48C39">1</int>
   </lst>
  </lst>
</lst>

{code}
  
> Get distinct count of names for a facet field
> ---------------------------------------------
>
>                 Key: SOLR-2242
>                 URL: https://issues.apache.org/jira/browse/SOLR-2242
>             Project: Solr
>          Issue Type: New Feature
>          Components: Response Writers
>    Affects Versions: 4.0
>            Reporter: Bill Bell
>            Priority: Minor
>             Fix For: 4.0
>
>         Attachments: SOLR-2242.patch, SOLR.2242.solr3.1.patch, SOLR.2242.v2.patch
>
>
> When returning facet.field=<name of field> you will get a list of matches for distinct values. This is normal behavior. This patch tells you how many distinct values you have (# of rows). Use with limit=-1 and mincount=1.
> The feature is called "namedistinct". Here is an example:
> http://localhost:8983/solr/select?q=*:*&facet=true&facet.field=manu&facet.mincount=1&facet.limit=-1&f.manu.facet.namedistinct=0&facet.field=price&f.price.facet.namedistinct=1
> Here is an example on field "hgid" (without namedistinct):
> {code}
> <lst name="facet_fields">
>   <lst name="hgid">
>     <int name="HGPY0000045FD36D4000A">1</int>
>     <int name="HGPY00000FBC6690453A9">1</int>
>     <int name="HGPY00001E44ED6C4FB3B">1</int>
>     <int name="HGPY00001FA631034A1B8">1</int>
>     <int name="HGPY00003317ABAC43B48">1</int>
>     <int name="HGPY00003A17B2294CB5A">5</int>
>     <int name="HGPY00003ADD2B3D48C39">1</int>
>   </lst>
> </lst>
> {code}
> With namedistinct (HGPY0000045FD36D4000A, HGPY00000FBC6690453A9, HGPY00001E44ED6C4FB3B, HGPY00001FA631034A1B8, HGPY00003317ABAC43B48, HGPY00003A17B2294CB5A, HGPY00003ADD2B3D48C39), this returns the number of rows (7), not the number of values (11).
> {code}
> <lst name="facet_fields">
>   <lst name="hgid">
>     <int name="_count_">7</int>
>   </lst>
> </lst>
> {code}
> This actually works really well for getting the total number of fields for a group.field=hgid. Enjoy!



[jira] [Commented] (SOLR-2242) Get distinct count of names for a facet field

Posted by "Simon Willnauer (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13049014#comment-13049014 ] 

Simon Willnauer commented on SOLR-2242:
---------------------------------------

Bill, this seems like an important issue, with many votes etc. I am traveling right now, so give me a few days to get back and I will work with you to get this done.
Thanks for your patience.

simon

> Get distinct count of names for a facet field
> ---------------------------------------------
>
>                 Key: SOLR-2242
>                 URL: https://issues.apache.org/jira/browse/SOLR-2242
>             Project: Solr
>          Issue Type: New Feature
>          Components: Response Writers
>    Affects Versions: 4.0
>            Reporter: Bill Bell
>            Assignee: Simon Willnauer
>            Priority: Minor
>             Fix For: 4.0
>
>         Attachments: SOLR-2242.patch, SOLR-2242.shard.patch, SOLR-2242.solr3.1.patch, SOLR.2242.solr3.1.patch, SOLR.2242.v2.patch
>
>



[jira] [Commented] (SOLR-2242) Get distinct count of names for a facet field

Posted by "Bill Bell (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13026828#comment-13026828 ] 

Bill Bell commented on SOLR-2242:
---------------------------------

It would be good to be able to cache just the count, instead of building a list that also gets cached.
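
A minimal sketch of the idea, outside of Solr: stream over the (term, count) pairs and keep only an integer, rather than materializing the list first. The hgid data is copied from the example in this issue; count_constraints is a hypothetical helper and says nothing about how the facet code would actually enumerate terms.

```python
def count_constraints(term_counts, mincount=1):
    """Count terms whose document count meets mincount, keeping only an int."""
    return sum(1 for _, n in term_counts if n >= mincount)


# (term, document count) pairs from the hgid example in this issue.
hgid = [
    ("HGPY0000045FD36D4000A", 1),
    ("HGPY00000FBC6690453A9", 1),
    ("HGPY00001E44ED6C4FB3B", 1),
    ("HGPY00001FA631034A1B8", 1),
    ("HGPY00003317ABAC43B48", 1),
    ("HGPY00003A17B2294CB5A", 5),
    ("HGPY00003ADD2B3D48C39", 1),
]

# Passing an iterator shows that the helper never needs the whole list.
num_terms = count_constraints(iter(hgid))
num_docs = sum(n for _, n in hgid)
```

This gives 7 terms against 11 documents, the same "7, not 11" distinction made in the issue description, and only the integer would need to be cached.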

> Get distinct count of names for a facet field
> ---------------------------------------------
>
>                 Key: SOLR-2242
>                 URL: https://issues.apache.org/jira/browse/SOLR-2242
>             Project: Solr
>          Issue Type: New Feature
>          Components: Response Writers
>    Affects Versions: 4.0
>            Reporter: Bill Bell
>            Priority: Minor
>             Fix For: 4.0
>
>         Attachments: SOLR-2242.patch, SOLR-2242.solr3.1.patch, SOLR.2242.solr3.1.patch, SOLR.2242.v2.patch
>
>



[jira] [Updated] (SOLR-2242) Get distinct count of names for a facet field

Posted by "Bill Bell (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bill Bell updated SOLR-2242:
----------------------------

    Attachment:     (was: SOLR-2242-3x_2.patch)
    
> Get distinct count of names for a facet field
> ---------------------------------------------
>
>                 Key: SOLR-2242
>                 URL: https://issues.apache.org/jira/browse/SOLR-2242
>             Project: Solr
>          Issue Type: New Feature
>          Components: Response Writers
>    Affects Versions: 4.0
>            Reporter: Bill Bell
>            Priority: Minor
>             Fix For: 4.0
>
>         Attachments: SOLR-2242-3x.patch, SOLR-2242-3x_3.patch, SOLR-2242-solr40-3.patch, SOLR-2242.patch, SOLR-2242.patch, SOLR-2242.patch, SOLR-2242.shard.withtests.patch, SOLR-2242.solr3.1-fix.patch, SOLR-2242.solr3.1.patch, SOLR.2242.solr3.1.patch
>
>



[jira] Updated: (SOLR-2242) Get distinct count of names for a facet field

Posted by "Bill Bell (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bill Bell updated SOLR-2242:
----------------------------

    Description: 
When returning facet.field=<name of field> you will get a list of matches for distinct values. This is normal behavior. This patch tells you how many distinct values you have (# of rows). Use with limit=-1 and mincount=1.



The feature is called "namedistinct". Here is an example:

http://localhost:8983/solr/select?q=*:*&facet=true&facet.field=manu&facet.mincount=1&facet.limit=-1&f.manu.facet.namedistinct=0&facet.field=price&f.price.facet.namedistinct=1

Here is an example on field "hgid" (without namedistinct):

{code}
<lst name="facet_fields">
  <lst name="hgid">
    <int name="HGPY0000045FD36D4000A">1</int>
    <int name="HGPY00000FBC6690453A9">1</int>
    <int name="HGPY00001E44ED6C4FB3B">1</int>
    <int name="HGPY00001FA631034A1B8">1</int>
    <int name="HGPY00003317ABAC43B48">1</int>
    <int name="HGPY00003A17B2294CB5A">5</int>
    <int name="HGPY00003ADD2B3D48C39">1</int>
  </lst>
</lst>
{code}

With namedistinct (HGPY0000045FD36D4000A, HGPY00000FBC6690453A9, HGPY00001E44ED6C4FB3B, HGPY00001FA631034A1B8, HGPY00003317ABAC43B48, HGPY00003A17B2294CB5A, HGPY00003ADD2B3D48C39), this returns the number of rows (7), not the number of values (11).

{code}
<lst name="facet_fields">
  <lst name="hgid">
    <int name="_count_">7</int>
  </lst>
</lst>
{code}
This actually works really well for getting the total number of fields for a group.field=hgid. Enjoy!

  was:
When returning facet.field=<name of field> you will get a list of matches for distinct values. This is normal behavior. This patch tells you how many distinct values you have (# of rows). Use with limit=-1 and mincount=1.



The feature is called "namedistinct". Here is an example:

http://localhost:8983/solr/select?q=*:*&facet=true&facet.field=manu&facet.mincount=1&facet.limit=-1&f.manu.facet.namedistinct=0&facet.field=price&f.price.facet.namedistinct=1

Here is an example on field "hgid" (without namedistinct):

- <lst name="facet_fields">
- <lst name="hgid">
  <int name="HGPY0000045FD36D4000A">1</int> 
  <int name="HGPY00000FBC6690453A9">1</int> 
  <int name="HGPY00001E44ED6C4FB3B">1</int> 
  <int name="HGPY00001FA631034A1B8">1</int> 
  <int name="HGPY00003317ABAC43B48">1</int> 
  <int name="HGPY00003A17B2294CB5A">5</int> 
  <int name="HGPY00003ADD2B3D48C39">1</int> 
  </lst>
  </lst>

With namedistinct (HGPY0000045FD36D4000A, HGPY00000FBC6690453A9, HGPY00001E44ED6C4FB3B, HGPY00001FA631034A1B8, HGPY00003317ABAC43B48, HGPY00003A17B2294CB5A, HGPY00003ADD2B3D48C39). This returns number of rows (7), not the number of values (11).

- <lst name="facet_fields">
- <lst name="hgid">
  <int name="_count_">7</int> 
  </lst>
  </lst>

This works actually really good to get total number of fields for a group.field=hgid. Enjoy!


> Get distinct count of names for a facet field
> ---------------------------------------------
>
>                 Key: SOLR-2242
>                 URL: https://issues.apache.org/jira/browse/SOLR-2242
>             Project: Solr
>          Issue Type: New Feature
>          Components: Response Writers
>    Affects Versions: 4.0
>            Reporter: Bill Bell
>            Priority: Minor
>             Fix For: 4.0
>
>         Attachments: SOLR-2242-distinctFacet.patch
>
>



[jira] [Commented] (SOLR-2242) Get distinct count of names for a facet field

Posted by "Bill Bell (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13240158#comment-13240158 ] 

Bill Bell commented on SOLR-2242:
---------------------------------

Yonik, agreed. However, what is the alternative? We are talking about distinct terms, and unless I limit the number of terms there could be a performance issue when using this with sharding, since I would need to send the terms, combine them, and look for uniques. I am willing to do that work (it is not that much coding; I am more worried about CPU and network performance). The one I submitted does change the format by ADDING a new section. It shouldn't break other facets (usually adding sections to the JSON/XML output should not be a hard break). The latest version does not change the facet_field section, so it is compatible.

I am working on getting the tests to work. Most seem to be trivial fixes rather than anything more serious, since we changed the format...

However, several people would like to use this. If I fix the test cases that are breaking can we consider a commit?


                
> Get distinct count of names for a facet field
> ---------------------------------------------
>
>                 Key: SOLR-2242
>                 URL: https://issues.apache.org/jira/browse/SOLR-2242
>             Project: Solr
>          Issue Type: New Feature
>          Components: Response Writers
>    Affects Versions: 4.0
>            Reporter: Bill Bell
>            Priority: Minor
>             Fix For: 4.0
>
>         Attachments: SOLR-2242-3x.patch, SOLR-2242-solr40-3.patch, SOLR-2242.patch, SOLR-2242.patch, SOLR-2242.patch, SOLR-2242.shard.withtests.patch, SOLR-2242.solr3.1-fix.patch, SOLR-2242.solr3.1.patch, SOLR-2242.solr35.patch, SOLR.2242.solr3.1.patch
>
>



[jira] Commented: (SOLR-2242) Get distinct count of names for a facet field

Posted by "Jonathan Rochkind (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13006432#comment-13006432 ] 

Jonathan Rochkind commented on SOLR-2242:
-----------------------------------------

I would love to see this feature in trunk, I could really use it. 

> Get distinct count of names for a facet field
> ---------------------------------------------
>
>                 Key: SOLR-2242
>                 URL: https://issues.apache.org/jira/browse/SOLR-2242
>             Project: Solr
>          Issue Type: New Feature
>          Components: Response Writers
>    Affects Versions: 4.0
>            Reporter: Bill Bell
>            Priority: Minor
>             Fix For: 4.0
>
>         Attachments: SOLR-2242-distinctFacet.patch
>
>



[jira] [Commented] (SOLR-2242) Get distinct count of names for a facet field

Posted by "Yonik Seeley (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13239544#comment-13239544 ] 

Yonik Seeley commented on SOLR-2242:
------------------------------------

There are other JIRA issues open for adding more facet-related data as well, and adding a new section for each doesn't seem desirable.
I think I'm still in favor of biting the bullet and changing the facet response format for 4.0, while having some sort of flag to enable the older format for back compat.
                
> Get distinct count of names for a facet field
> ---------------------------------------------
>
>                 Key: SOLR-2242
>                 URL: https://issues.apache.org/jira/browse/SOLR-2242
>             Project: Solr
>          Issue Type: New Feature
>          Components: Response Writers
>    Affects Versions: 4.0
>            Reporter: Bill Bell
>            Priority: Minor
>             Fix For: 4.0
>
>         Attachments: SOLR-2242-3x.patch, SOLR-2242-solr40-3.patch, SOLR-2242.patch, SOLR-2242.patch, SOLR-2242.patch, SOLR-2242.shard.withtests.patch, SOLR-2242.solr3.1-fix.patch, SOLR-2242.solr3.1.patch, SOLR-2242.solr35.patch, SOLR.2242.solr3.1.patch
>
>



[jira] [Commented] (SOLR-2242) Get distinct count of names for a facet field

Posted by "Erick Erickson (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13239441#comment-13239441 ] 

Erick Erickson commented on SOLR-2242:
--------------------------------------

Bill:

Tests do not pass on either 3.x or trunk with this patch.
Some 3.x failures:

ant test -Dtestcase=TestDistributedSearch
ant test -Dtestcase=testGroupingGroupedBasedFaceting
ant test -Dtestcase=TestDistributedGrouping

Some 4.x failures:
ant test -Dtestcase=BasicDistributedZkTest
ant test -Dtestcase=TestGroupingSearch

I'm not sure whether these are test problems or more serious...
                
> Get distinct count of names for a facet field
> ---------------------------------------------
>
>                 Key: SOLR-2242
>                 URL: https://issues.apache.org/jira/browse/SOLR-2242
>             Project: Solr
>          Issue Type: New Feature
>          Components: Response Writers
>    Affects Versions: 4.0
>            Reporter: Bill Bell
>            Priority: Minor
>             Fix For: 4.0
>
>         Attachments: SOLR-2242-3x.patch, SOLR-2242-solr40-3.patch, SOLR-2242.patch, SOLR-2242.patch, SOLR-2242.patch, SOLR-2242.shard.withtests.patch, SOLR-2242.solr3.1-fix.patch, SOLR-2242.solr3.1.patch, SOLR-2242.solr35.patch, SOLR.2242.solr3.1.patch
>
>
> When returning facet.field=<name of field> you will get a list of matches for distinct values. This is normal behavior. This patch tells you how many distinct values you have (# of rows). Use with limit=-1 and mincount=1.
> The feature is called "namedistinct". Here is an example:
> http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=2&facet.limit=-1&facet.field=price
> http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=0&facet.limit=-1&facet.field=price
> http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=1&facet.limit=-1&facet.field=price
> This currently only works on facet.field.
> {code}
> <lst name="facet_fields">
>   <lst name="price">
>     <int name="numFacetTerms">14</int>
>     <int name="0.0">3</int><int name="11.5">1</int><int name="19.95">1</int><int name="74.99">1</int><int name="92.0">1</int><int name="179.99">1</int><int name="185.0">1</int><int name="279.95">1</int><int name="329.95">1</int><int name="350.0">1</int><int name="399.0">1</int><int name="479.95">1</int><int name="649.99">1</int><int name="2199.0">1</int>
>   </lst>
> </lst>
> {code} 
> Several people use this to get the group.field count (the # of groups).



[jira] [Commented] (SOLR-2242) Get distinct count of names for a facet field

Posted by "Ryan McKinley (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13071162#comment-13071162 ] 

Ryan McKinley commented on SOLR-2242:
-------------------------------------

bq. Perhaps we should return the maximum and sum of all shard counts? That way, assuming the client knew how many shards exist, they could handle most scenarios.

Once we change the output format, we should be able to add a few things to the output. Perhaps something like:
{code:xml}
<lst name="text">
    <int name="numTerms">385</int>
    <int name="distribMaxTerms">385</int>
    <int name="distribSumTerms">845</int>
    <lst name="counts">
      ...
{code}
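
To make the max/sum idea concrete, here is a minimal sketch of why the two numbers bracket the true distinct count. The per-shard term sets and the function names are hypothetical and are not taken from any patch on this issue. The maximum is a lower bound (every term on the largest shard is distinct), and the sum is an upper bound (it is exact only if no term appears on more than one shard).

```python
def distinct_term_bounds(shard_terms):
    """Return (max, sum) of the per-shard distinct term counts."""
    per_shard = [len(set(terms)) for terms in shard_terms]
    return max(per_shard, default=0), sum(per_shard)


def exact_distinct_terms(shard_terms):
    """Exact distinct count; needs the full term lists from every shard."""
    merged = set()
    for terms in shard_terms:
        merged.update(terms)
    return len(merged)


# Hypothetical shards: "19.95" is indexed on both, so the sum overcounts by one.
shards = [
    {"0.0", "11.5", "19.95"},
    {"19.95", "74.99"},
]
max_terms, sum_terms = distinct_term_bounds(shards)
exact = exact_distinct_terms(shards)
print(max_terms, exact, sum_terms)  # 3 4 5
```

A client that only sees the max and the sum can tell the answer is exact when the two are equal (a single shard holds all the terms and the others hold none); otherwise it only knows the range.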

> Get distinct count of names for a facet field
> ---------------------------------------------
>
>                 Key: SOLR-2242
>                 URL: https://issues.apache.org/jira/browse/SOLR-2242
>             Project: Solr
>          Issue Type: New Feature
>          Components: Response Writers
>    Affects Versions: 4.0
>            Reporter: Bill Bell
>            Assignee: Simon Willnauer
>            Priority: Minor
>             Fix For: 4.0
>
>         Attachments: NumFacetTermsFacetsTest.java, SOLR-2242-notworkingtest.patch, SOLR-2242.patch, SOLR-2242.patch, SOLR-2242.shard.patch, SOLR-2242.shard.patch, SOLR-2242.shard.withtests.patch, SOLR-2242.solr3.1.patch, SOLR.2242.solr3.1.patch, SOLR.2242.v2.patch
>
>
> When returning facet.field=<name of field> you will get a list of matches for distinct values. This is normal behavior. This patch tells you how many distinct values you have (# of rows). Use with limit=-1 and mincount=1.
> The feature is called "namedistinct". Here is an example:
> http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=2&facet.limit=-1&facet.field=price
> http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=0&facet.limit=-1&facet.field=price
> http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=1&facet.limit=-1&facet.field=price
> This currently only works on facet.field.
> {code}
> <lst name="facet_fields">
>   <lst name="price">
>     <int name="numFacetTerms">14</int>
>     <int name="0.0">3</int><int name="11.5">1</int><int name="19.95">1</int><int name="74.99">1</int><int name="92.0">1</int><int name="179.99">1</int><int name="185.0">1</int><int name="279.95">1</int><int name="329.95">1</int><int name="350.0">1</int><int name="399.0">1</int><int name="479.95">1</int><int name="649.99">1</int><int name="2199.0">1</int>
>   </lst>
> </lst>
> {code} 
> Several people use this to get the group.field count (the # of groups).



[jira] [Updated] (SOLR-2242) Get distinct count of names for a facet field

Posted by "Bill Bell (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bill Bell updated SOLR-2242:
----------------------------

    Attachment: SOLR-2242-solr40-3.patch

Fixed order of facet_numTerms and fixed the getShard call to be consistent with SOLR 3.5

I think this is ready...
                
> Get distinct count of names for a facet field
> ---------------------------------------------
>
>                 Key: SOLR-2242
>                 URL: https://issues.apache.org/jira/browse/SOLR-2242
>             Project: Solr
>          Issue Type: New Feature
>          Components: Response Writers
>    Affects Versions: 4.0
>            Reporter: Bill Bell
>            Priority: Minor
>             Fix For: 4.0
>
>         Attachments: NumFacetTermsFacetsTest.java, SOLR-2242-notworkingtest.patch, SOLR-2242-solr40-2.patch, SOLR-2242-solr40-3.patch, SOLR-2242-solr40.patch, SOLR-2242.patch, SOLR-2242.patch, SOLR-2242.patch, SOLR-2242.shard.patch, SOLR-2242.shard.patch, SOLR-2242.shard.withtests.patch, SOLR-2242.solr3.1-fix.patch, SOLR-2242.solr3.1.patch, SOLR-2242.solr35.patch, SOLR.2242.solr3.1.patch, SOLR.2242.v2.patch
>
>
> When returning facet.field=<name of field> you will get a list of matches for distinct values. This is normal behavior. This patch tells you how many distinct values you have (# of rows). Use with limit=-1 and mincount=1.
> The feature is called "namedistinct". Here is an example:
> http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=2&facet.limit=-1&facet.field=price
> http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=0&facet.limit=-1&facet.field=price
> http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=1&facet.limit=-1&facet.field=price
> This currently only works on facet.field.
> {code}
> <lst name="facet_fields">
>   <lst name="price">
>     <int name="numFacetTerms">14</int>
>     <int name="0.0">3</int><int name="11.5">1</int><int name="19.95">1</int><int name="74.99">1</int><int name="92.0">1</int><int name="179.99">1</int><int name="185.0">1</int><int name="279.95">1</int><int name="329.95">1</int><int name="350.0">1</int><int name="399.0">1</int><int name="479.95">1</int><int name="649.99">1</int><int name="2199.0">1</int>
>   </lst>
> </lst>
> {code} 
> Several people use this to get the group.field count (the # of groups).



[jira] Commented: (SOLR-2242) Get distinct count of names for a facet field

Posted by "Bill Bell (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13007345#comment-13007345 ] 

Bill Bell commented on SOLR-2242:
---------------------------------

OK, I did the required work. Can we get more feedback, or can it be committed? What else is needed?

> Get distinct count of names for a facet field
> ---------------------------------------------
>
>                 Key: SOLR-2242
>                 URL: https://issues.apache.org/jira/browse/SOLR-2242
>             Project: Solr
>          Issue Type: New Feature
>          Components: Response Writers
>    Affects Versions: 4.0
>            Reporter: Bill Bell
>            Priority: Minor
>             Fix For: 4.0
>
>         Attachments: SOLR.2242.v2.patch
>
>
> When returning facet.field=<name of field> you will get a list of matches for distinct values. This is normal behavior. This patch tells you how many distinct values you have (# of rows). Use with limit=-1 and mincount=1.
> The feature is called "namedistinct". Here is an example:
> http://localhost:8983/solr/select?q=*:*&facet=true&facet.field=manu&facet.mincount=1&facet.limit=-1&f.manu.facet.namedistinct=0&facet.field=price&f.price.facet.namedistinct=1
> Here is an example on field "hgid" (without namedistinct):
> {code}
> - <lst name="facet_fields">
> - <lst name="hgid">
>   <int name="HGPY0000045FD36D4000A">1</int> 
>   <int name="HGPY00000FBC6690453A9">1</int> 
>   <int name="HGPY00001E44ED6C4FB3B">1</int> 
>   <int name="HGPY00001FA631034A1B8">1</int> 
>   <int name="HGPY00003317ABAC43B48">1</int> 
>   <int name="HGPY00003A17B2294CB5A">5</int> 
>   <int name="HGPY00003ADD2B3D48C39">1</int> 
>   </lst>
>   </lst>
> {code}
> With namedistinct (HGPY0000045FD36D4000A, HGPY00000FBC6690453A9, HGPY00001E44ED6C4FB3B, HGPY00001FA631034A1B8, HGPY00003317ABAC43B48, HGPY00003A17B2294CB5A, HGPY00003ADD2B3D48C39). This returns number of rows (7), not the number of values (11).
> {code}
> - <lst name="facet_fields">
> - <lst name="hgid">
>   <int name="_count_">7</int> 
>   </lst>
>   </lst>
> {code}
> This works actually really good to get total number of fields for a group.field=hgid. Enjoy!



[jira] Updated: (SOLR-2242) Get distinct count of names for a facet field

Posted by "Bill Bell (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bill Bell updated SOLR-2242:
----------------------------

    Description: 
When returning facet.field=<name of field> you will get a list of matches for distinct values. This is normal behavior. This patch tells you how many distinct values you have (# of rows). Use with limit=-1 and mincount=1.



The feature is called "namedistinct". Here is an example:

http://localhost:8983/solr/select?q=*:*&facet=true&facet.field=manu&facet.mincount=1&facet.limit=-1&f.manu.facet.namedistinct=0&facet.field=price&f.price.facet.namedistinct=1

Here is an example on field "hgid" (without namedistinct):

<lst name="facet_fields">
  <lst name="hgid">
    <int name="HGPY0000045FD36D4000A">1</int>
    <int name="HGPY00000FBC6690453A9">1</int>
    <int name="HGPY00001E44ED6C4FB3B">1</int>
    <int name="HGPY00001FA631034A1B8">1</int>
    <int name="HGPY00003317ABAC43B48">1</int>
    <int name="HGPY00003A17B2294CB5A">5</int>
    <int name="HGPY00003ADD2B3D48C39">1</int>
  </lst>
</lst>

With namedistinct (HGPY0000045FD36D4000A, HGPY00000FBC6690453A9, HGPY00001E44ED6C4FB3B, HGPY00001FA631034A1B8, HGPY00003317ABAC43B48, HGPY00003A17B2294CB5A, HGPY00003ADD2B3D48C39). This returns number of rows (7), not the number of values (11).

<lst name="facet_fields">
  <lst name="hgid">
    <int name="_count_">7</int>
  </lst>
</lst>

This works really well for getting the total number of groups for group.field=hgid. Enjoy!
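
The difference between "rows" and "values" in the hgid example can be checked on the client side. The following is only a sketch in Python using the standard library XML parser; the function name facet_summary is made up for illustration and is not part of the patch. It counts the entries in the facet list (rows) and adds up their counts (values).

```python
import xml.etree.ElementTree as ET

# The hgid facet response from the description, without namedistinct.
SAMPLE = """
<lst name="facet_fields">
  <lst name="hgid">
    <int name="HGPY0000045FD36D4000A">1</int>
    <int name="HGPY00000FBC6690453A9">1</int>
    <int name="HGPY00001E44ED6C4FB3B">1</int>
    <int name="HGPY00001FA631034A1B8">1</int>
    <int name="HGPY00003317ABAC43B48">1</int>
    <int name="HGPY00003A17B2294CB5A">5</int>
    <int name="HGPY00003ADD2B3D48C39">1</int>
  </lst>
</lst>
"""


def facet_summary(xml_text, field):
    """Return (number of distinct facet values, sum of their counts)."""
    root = ET.fromstring(xml_text.strip())
    entries = root.findall(f"./lst[@name='{field}']/int")
    counts = {e.get("name"): int(e.text) for e in entries}
    return len(counts), sum(counts.values())


rows, values = facet_summary(SAMPLE, "hgid")
print(rows, values)  # 7 11
```

The point of the feature is that the server returns the first number directly, so the client does not have to fetch the whole list with limit=-1 just to count it.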

  was:
See SOLR-236.

Need ability to get "count" back for the unique facets for grouping (field collapsing) instead of returning the facets. 




> Get distinct count of names for a facet field
> ---------------------------------------------
>
>                 Key: SOLR-2242
>                 URL: https://issues.apache.org/jira/browse/SOLR-2242
>             Project: Solr
>          Issue Type: New Feature
>          Components: Response Writers
>    Affects Versions: 4.0
>            Reporter: Bill Bell
>            Priority: Minor
>             Fix For: 4.0
>
>
> When returning facet.field=<name of field> you will get a list of matches for distinct values. This is normal behavior. This patch tells you how many distinct values you have (# of rows). Use with limit=-1 and mincount=1.
> The feature is called "namedistinct". Here is an example:
> http://localhost:8983/solr/select?q=*:*&facet=true&facet.field=manu&facet.mincount=1&facet.limit=-1&f.manu.facet.namedistinct=0&facet.field=price&f.price.facet.namedistinct=1
> Here is an example on field "hgid" (without namedistinct):
> - <lst name="facet_fields">
> - <lst name="hgid">
>   <int name="HGPY0000045FD36D4000A">1</int> 
>   <int name="HGPY00000FBC6690453A9">1</int> 
>   <int name="HGPY00001E44ED6C4FB3B">1</int> 
>   <int name="HGPY00001FA631034A1B8">1</int> 
>   <int name="HGPY00003317ABAC43B48">1</int> 
>   <int name="HGPY00003A17B2294CB5A">5</int> 
>   <int name="HGPY00003ADD2B3D48C39">1</int> 
>   </lst>
>   </lst>
> With namedistinct (HGPY0000045FD36D4000A, HGPY00000FBC6690453A9, HGPY00001E44ED6C4FB3B, HGPY00001FA631034A1B8, HGPY00003317ABAC43B48, HGPY00003A17B2294CB5A, HGPY00003ADD2B3D48C39). This returns number of rows (7), not the number of values (11).
> - <lst name="facet_fields">
> - <lst name="hgid">
>   <int name="_count_">7</int> 
>   </lst>
>   </lst>
> This works actually really good to get total number of fields for a group.field=hgid. Enjoy!



[jira] [Updated] (SOLR-2242) Get distinct count of names for a facet field

Posted by "Bill Bell (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bill Bell updated SOLR-2242:
----------------------------

    Attachment: SOLR-2242-solr40.patch

SOLR 4.0 TRUNK version.
                
> Get distinct count of names for a facet field
> ---------------------------------------------
>
>                 Key: SOLR-2242
>                 URL: https://issues.apache.org/jira/browse/SOLR-2242
>             Project: Solr
>          Issue Type: New Feature
>          Components: Response Writers
>    Affects Versions: 4.0
>            Reporter: Bill Bell
>            Assignee: Erick Erickson
>            Priority: Minor
>             Fix For: 4.0
>
>         Attachments: NumFacetTermsFacetsTest.java, SOLR-2242-notworkingtest.patch, SOLR-2242-solr40.patch, SOLR-2242.patch, SOLR-2242.patch, SOLR-2242.patch, SOLR-2242.shard.patch, SOLR-2242.shard.patch, SOLR-2242.shard.withtests.patch, SOLR-2242.solr3.1-fix.patch, SOLR-2242.solr3.1.patch, SOLR.2242.solr3.1.patch, SOLR.2242.v2.patch
>
>
> When returning facet.field=<name of field> you will get a list of matches for distinct values. This is normal behavior. This patch tells you how many distinct values you have (# of rows). Use with limit=-1 and mincount=1.
> The feature is called "namedistinct". Here is an example:
> http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=2&facet.limit=-1&facet.field=price
> http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=0&facet.limit=-1&facet.field=price
> http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=1&facet.limit=-1&facet.field=price
> This currently only works on facet.field.
> {code}
> <lst name="facet_fields">
>   <lst name="price">
>     <int name="numFacetTerms">14</int>
>     <int name="0.0">3</int><int name="11.5">1</int><int name="19.95">1</int><int name="74.99">1</int><int name="92.0">1</int><int name="179.99">1</int><int name="185.0">1</int><int name="279.95">1</int><int name="329.95">1</int><int name="350.0">1</int><int name="399.0">1</int><int name="479.95">1</int><int name="649.99">1</int><int name="2199.0">1</int>
>   </lst>
> </lst>
> {code} 
> Several people use this to get the group.field count (the # of groups).



[jira] [Commented] (SOLR-2242) Get distinct count of names for a facet field

Posted by "Bill Bell (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13064414#comment-13064414 ] 

Bill Bell commented on SOLR-2242:
---------------------------------

Yonik,

Yes, I know about group.ngroups. But the use case still stands. We need a way to add up facet terms without actually counting them.

I restructured the facet_fields XML as you recommended (twice), and the issue is that it breaks ALL sharding. Distribution breaks because it looks for <int> and not <lst>... Several people have wanted me to change the name to count, to term, to distinct... I really don't care what the name is, since it makes sense when you try it. I think changing the distribution code is a MUCH larger project. If you want to jump in on the sharding/distribution to make it work with lists, then please help. The format change is a HUGE issue. The magic names could also be an issue, but ONLY if you use this new feature. It is not an issue for other APIs and usage, which is why I added it as a magic variable.

Do we have any examples with Boolean? I have not seen any... Do we use true/false or on/off? Do you mean like facet=true? The reason I have a 1 and a 2 is to get the count of all terms while returning only a smaller set (internal limit=-1, but the user passes limit=5). I believe that is very useful.

Having numFacetTerms appear like every other term pretty much works with sharding/distribution. It just gets added together like any other facet count. One server returns numFacetTerms=5, the other returns numFacetTerms=10, and the combined result is 15. It may break some new feature with distribution, or something I am not aware of and not using...

Concerning building in memory: having it cached is what I was trying to achieve. If there is another way to cache the result, then let me know the other options. Not having it cached at all is a huge performance problem. If you are using mode 2, it does not matter that much, since you need to return the list and in most cases you have it in memory... Mode 1 hides it a bit and builds the entire list in memory when we only need to cache the one value... Again, I am not sure how to achieve that without breaking something else.

As long as there are no more gotchas in distribution, most of the other things you are listing (XML, name change, boolean) are close to preferences, and since the XML format change would be a huge issue, we should be able to commit? Also, I would like to avoid caching the entire list in memory when using this, and I need some assistance with that.

1. Are there any other distribution/sharding issues with adding a magic variable in facet_fields for a new feature?
2. Where and how do we store a cache value without using the array that is present, so we don't cache the whole facet term list when we only need to cache the resulting number?

Thanks.
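
For readers following the "magic variable" discussion, here is a rough client-side sketch of what it means in practice: numFacetTerms arrives in the same list as the real terms, so a client has to pull it out by name. This is Python with the standard library XML parser; the function name split_magic_entry is hypothetical, and treating the first entry named numFacetTerms as the magic one is an assumption, not something the patch guarantees.

```python
import xml.etree.ElementTree as ET

MAGIC = "numFacetTerms"  # entry name used in the example response on this issue

SAMPLE = """
<lst name="facet_fields">
  <lst name="price">
    <int name="numFacetTerms">14</int>
    <int name="0.0">3</int><int name="11.5">1</int><int name="19.95">1</int>
    <int name="74.99">1</int><int name="92.0">1</int><int name="179.99">1</int>
    <int name="185.0">1</int><int name="279.95">1</int><int name="329.95">1</int>
    <int name="350.0">1</int><int name="399.0">1</int><int name="479.95">1</int>
    <int name="649.99">1</int><int name="2199.0">1</int>
  </lst>
</lst>
"""


def split_magic_entry(xml_text, field, magic=MAGIC):
    """Separate the magic count entry from the ordinary term counts."""
    root = ET.fromstring(xml_text.strip())
    num_terms = None
    counts = {}
    for entry in root.findall(f"./lst[@name='{field}']/int"):
        name = entry.get("name")
        if name == magic and num_terms is None:
            # Assumption: the first entry with the magic name is the count.
            num_terms = int(entry.text)
        else:
            counts[name] = int(entry.text)
    return num_terms, counts


num_terms, counts = split_magic_entry(SAMPLE, "price")
print(num_terms, len(counts))  # 14 14
```

The sketch also shows the collision risk raised in the review: if a field really contained the value "numFacetTerms", the client could not tell the two entries apart by name alone.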




> Get distinct count of names for a facet field
> ---------------------------------------------
>
>                 Key: SOLR-2242
>                 URL: https://issues.apache.org/jira/browse/SOLR-2242
>             Project: Solr
>          Issue Type: New Feature
>          Components: Response Writers
>    Affects Versions: 4.0
>            Reporter: Bill Bell
>            Assignee: Simon Willnauer
>            Priority: Minor
>             Fix For: 4.0
>
>         Attachments: NumFacetTermsFacetsTest.java, SOLR-2242-notworkingtest.patch, SOLR-2242.patch, SOLR-2242.patch, SOLR-2242.shard.patch, SOLR-2242.shard.patch, SOLR-2242.shard.withtests.patch, SOLR-2242.solr3.1.patch, SOLR.2242.solr3.1.patch, SOLR.2242.v2.patch
>
>
> When returning facet.field=<name of field> you will get a list of matches for distinct values. This is normal behavior. This patch tells you how many distinct values you have (# of rows). Use with limit=-1 and mincount=1.
> The feature is called "namedistinct". Here is an example:
> http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=2&facet.limit=-1&facet.field=price
> http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=0&facet.limit=-1&facet.field=price
> http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=1&facet.limit=-1&facet.field=price
> This currently only works on facet.field.
> {code}
> <lst name="facet_fields">
>   <lst name="price">
>     <int name="numFacetTerms">14</int>
>     <int name="0.0">3</int><int name="11.5">1</int><int name="19.95">1</int><int name="74.99">1</int><int name="92.0">1</int><int name="179.99">1</int><int name="185.0">1</int><int name="279.95">1</int><int name="329.95">1</int><int name="350.0">1</int><int name="399.0">1</int><int name="479.95">1</int><int name="649.99">1</int><int name="2199.0">1</int>
>   </lst>
> </lst>
> {code} 
> Several people use this to get the group.field count (the # of groups).



[jira] [Commented] (SOLR-2242) Get distinct count of names for a facet field

Posted by "Jonathan Rochkind (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13026297#comment-13026297 ] 

Jonathan Rochkind commented on SOLR-2242:
-----------------------------------------

Wonderful, much better. Thanks Lance, this is a much clearer and more flexible API, consistent with other parts of Solr. (And it is a feature I could definitely use, thanks Bill.)

But I wonder... should it be facet.numTerms, to group it with the other faceting-related params? Or wait, is it already?


> Get distinct count of names for a facet field
> ---------------------------------------------
>
>                 Key: SOLR-2242
>                 URL: https://issues.apache.org/jira/browse/SOLR-2242
>             Project: Solr
>          Issue Type: New Feature
>          Components: Response Writers
>    Affects Versions: 4.0
>            Reporter: Bill Bell
>            Priority: Minor
>             Fix For: 4.0
>
>         Attachments: SOLR-2242.patch, SOLR-2242.solr3.1.patch, SOLR.2242.solr3.1.patch, SOLR.2242.v2.patch
>
>
> When returning facet.field=<name of field> you will get a list of matches for distinct values. This is normal behavior. This patch tells you how many distinct values you have (# of rows). Use with limit=-1 and mincount=1.
> The feature is called "namedistinct". Here is an example:
> http://localhost:8983/solr/select?q=*:*&facet=true&facet.field=manu&facet.mincount=1&facet.limit=-1&f.manu.facet.namedistinct=0&facet.field=price&f.price.facet.namedistinct=1
> Here is an example on field "hgid" (without namedistinct):
> {code}
> - <lst name="facet_fields">
> - <lst name="hgid">
>   <int name="HGPY0000045FD36D4000A">1</int> 
>   <int name="HGPY00000FBC6690453A9">1</int> 
>   <int name="HGPY00001E44ED6C4FB3B">1</int> 
>   <int name="HGPY00001FA631034A1B8">1</int> 
>   <int name="HGPY00003317ABAC43B48">1</int> 
>   <int name="HGPY00003A17B2294CB5A">5</int> 
>   <int name="HGPY00003ADD2B3D48C39">1</int> 
>   </lst>
>   </lst>
> {code}
> With namedistinct (HGPY0000045FD36D4000A, HGPY00000FBC6690453A9, HGPY00001E44ED6C4FB3B, HGPY00001FA631034A1B8, HGPY00003317ABAC43B48, HGPY00003A17B2294CB5A, HGPY00003ADD2B3D48C39). This returns number of rows (7), not the number of values (11).
> {code}
> - <lst name="facet_fields">
> - <lst name="hgid">
>   <int name="_count_">7</int> 
>   </lst>
>   </lst>
> {code}
> This works actually really good to get total number of fields for a group.field=hgid. Enjoy!



[jira] [Updated] (SOLR-2242) Get distinct count of names for a facet field

Posted by "Bill Bell (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bill Bell updated SOLR-2242:
----------------------------

    Attachment:     (was: SOLR-2242.solr35.patch)
    
> Get distinct count of names for a facet field
> ---------------------------------------------
>
>                 Key: SOLR-2242
>                 URL: https://issues.apache.org/jira/browse/SOLR-2242
>             Project: Solr
>          Issue Type: New Feature
>          Components: Response Writers
>    Affects Versions: 4.0
>            Reporter: Bill Bell
>            Priority: Minor
>             Fix For: 4.0
>
>         Attachments: SOLR-2242-3x.patch, SOLR-2242-3x_3.patch, SOLR-2242-solr40-3.patch, SOLR-2242.patch, SOLR-2242.patch, SOLR-2242.patch, SOLR-2242.shard.withtests.patch, SOLR-2242.solr3.1-fix.patch, SOLR-2242.solr3.1.patch, SOLR.2242.solr3.1.patch
>
>
> When returning facet.field=<name of field> you will get a list of matches for distinct values. This is normal behavior. This patch tells you how many distinct values you have (# of rows). Use with limit=-1 and mincount=1.
> The feature is called "namedistinct". Here is an example:
> http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=2&facet.limit=-1&facet.field=price
> http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=0&facet.limit=-1&facet.field=price
> http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=1&facet.limit=-1&facet.field=price
> This currently only works on facet.field.
> {code}
> <lst name="facet_fields">
>   <lst name="price">
>     <int name="numFacetTerms">14</int>
>     <int name="0.0">3</int><int name="11.5">1</int><int name="19.95">1</int><int name="74.99">1</int><int name="92.0">1</int><int name="179.99">1</int><int name="185.0">1</int><int name="279.95">1</int><int name="329.95">1</int><int name="350.0">1</int><int name="399.0">1</int><int name="479.95">1</int><int name="649.99">1</int><int name="2199.0">1</int>
>   </lst>
> </lst>
> {code} 
> Several people use this to get the group.field count (the # of groups).



[jira] [Issue Comment Edited] (SOLR-2242) Get distinct count of names for a facet field

Posted by "Bill Bell (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13008825#comment-13008825 ] 

Bill Bell edited comment on SOLR-2242 at 3/23/11 2:31 AM:
----------------------------------------------------------

Can someone look this patch over?

Also requested +1 from Isha Garg <is...@orkash.com>

Thanks.

      was (Author: billnbell):
    Can someone loom this patch over?

Thanks,.
  
> Get distinct count of names for a facet field
> ---------------------------------------------
>
>                 Key: SOLR-2242
>                 URL: https://issues.apache.org/jira/browse/SOLR-2242
>             Project: Solr
>          Issue Type: New Feature
>          Components: Response Writers
>    Affects Versions: 4.0
>            Reporter: Bill Bell
>            Priority: Minor
>             Fix For: 4.0
>
>         Attachments: SOLR.2242.v2.patch
>
>
> When returning facet.field=<name of field> you will get a list of matches for distinct values. This is normal behavior. This patch tells you how many distinct values you have (# of rows). Use with limit=-1 and mincount=1.
> The feature is called "namedistinct". Here is an example:
> http://localhost:8983/solr/select?q=*:*&facet=true&facet.field=manu&facet.mincount=1&facet.limit=-1&f.manu.facet.namedistinct=0&facet.field=price&f.price.facet.namedistinct=1
> Here is an example on field "hgid" (without namedistinct):
> {code}
> <lst name="facet_fields">
>   <lst name="hgid">
>     <int name="HGPY0000045FD36D4000A">1</int>
>     <int name="HGPY00000FBC6690453A9">1</int>
>     <int name="HGPY00001E44ED6C4FB3B">1</int>
>     <int name="HGPY00001FA631034A1B8">1</int>
>     <int name="HGPY00003317ABAC43B48">1</int>
>     <int name="HGPY00003A17B2294CB5A">5</int>
>     <int name="HGPY00003ADD2B3D48C39">1</int>
>   </lst>
> </lst>
> {code}
> With namedistinct (HGPY0000045FD36D4000A, HGPY00000FBC6690453A9, HGPY00001E44ED6C4FB3B, HGPY00001FA631034A1B8, HGPY00003317ABAC43B48, HGPY00003A17B2294CB5A, HGPY00003ADD2B3D48C39). This returns number of rows (7), not the number of values (11).
> {code}
> <lst name="facet_fields">
>   <lst name="hgid">
>     <int name="_count_">7</int>
>   </lst>
> </lst>
> {code}
> This works really well for getting the total number of groups for group.field=hgid. Enjoy!
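
The difference between "number of rows" and "number of values" in the description above can be checked by hand. The following is only a sketch in Python that parses the hgid facet list quoted above; the helper name distinct_and_total is made up for illustration and is not part of the patch.

```python
import xml.etree.ElementTree as ET

# Facet section copied from the hgid example above (without namedistinct).
FACET_XML = """
<lst name="facet_fields">
  <lst name="hgid">
    <int name="HGPY0000045FD36D4000A">1</int>
    <int name="HGPY00000FBC6690453A9">1</int>
    <int name="HGPY00001E44ED6C4FB3B">1</int>
    <int name="HGPY00001FA631034A1B8">1</int>
    <int name="HGPY00003317ABAC43B48">1</int>
    <int name="HGPY00003A17B2294CB5A">5</int>
    <int name="HGPY00003ADD2B3D48C39">1</int>
  </lst>
</lst>
"""


def distinct_and_total(facet_xml, field):
    """Return (number of distinct facet names, sum of their counts).

    Hypothetical helper: it only reads a facet_fields fragment that has
    already been fetched, it does not talk to Solr.
    """
    root = ET.fromstring(facet_xml.strip())
    counts = [int(e.text) for e in root.findall(f"./lst[@name='{field}']/int")]
    return len(counts), sum(counts)


rows, values = distinct_and_total(FACET_XML, "hgid")
print(rows, values)  # prints "7 11"
```

The first number is what the patch reports as _count_; the second is the total number of matching documents across those facet values.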



[jira] [Commented] (SOLR-2242) Get distinct count of names for a facet field

Posted by "Bill Bell (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13048966#comment-13048966 ] 

Bill Bell commented on SOLR-2242:
---------------------------------

Thanks Mike.

I think it is committable since shards work now. We might need to fix some broken tests (and I am willing to do that).

Then we can move to range and queries...

Thanks. 

> Get distinct count of names for a facet field
> ---------------------------------------------
>
>                 Key: SOLR-2242
>                 URL: https://issues.apache.org/jira/browse/SOLR-2242
>             Project: Solr
>          Issue Type: New Feature
>          Components: Response Writers
>    Affects Versions: 4.0
>            Reporter: Bill Bell
>            Priority: Minor
>             Fix For: 4.0
>
>         Attachments: SOLR-2242.patch, SOLR-2242.shard.patch, SOLR-2242.solr3.1.patch, SOLR.2242.solr3.1.patch, SOLR.2242.v2.patch
>
>
> When returning facet.field=<name of field> you will get a list of matches for distinct values. This is normal behavior. This patch tells you how many distinct values you have (# of rows). Use with limit=-1 and mincount=1.
> The feature is called "namedistinct". Here is an example:
> http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=2&facet.limit=-1&facet.field=price
> http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=0&facet.limit=-1&facet.field=price
> http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=1&facet.limit=-1&facet.field=price
> This currently only works on facet.field.
> {code}
> <lst name="facet_fields">
>   <lst name="price">
>     <int name="numFacetTerms">14</int>
>     <int name="0.0">3</int><int name="11.5">1</int><int name="19.95">1</int><int name="74.99">1</int><int name="92.0">1</int><int name="179.99">1</int><int name="185.0">1</int><int name="279.95">1</int><int name="329.95">1</int><int name="350.0">1</int><int name="399.0">1</int><int name="479.95">1</int><int name="649.99">1</int><int name="2199.0">1</int>
>   </lst>
> </lst>
> {code} 
> Several people use this to get the group.field count (the # of groups).



[jira] [Commented] (SOLR-2242) Get distinct count of names for a facet field

Posted by "Bill Bell (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13240270#comment-13240270 ] 

Bill Bell commented on SOLR-2242:
---------------------------------

All tests pass on branch_3x now. 


                
> Get distinct count of names for a facet field
> ---------------------------------------------
>
>                 Key: SOLR-2242
>                 URL: https://issues.apache.org/jira/browse/SOLR-2242
>             Project: Solr
>          Issue Type: New Feature
>          Components: Response Writers
>    Affects Versions: 4.0
>            Reporter: Bill Bell
>            Priority: Minor
>             Fix For: 4.0
>
>         Attachments: SOLR-2242-3x.patch, SOLR-2242-3x_4.patch, SOLR-2242-solr40-3.patch, SOLR-2242.patch, SOLR-2242.patch, SOLR-2242.patch, SOLR-2242.shard.withtests.patch, SOLR-2242.solr3.1-fix.patch, SOLR-2242.solr3.1.patch, SOLR.2242.solr3.1.patch
>
>
> When returning facet.field=<name of field> you will get a list of matches for distinct values. This is normal behavior. This patch tells you how many distinct values you have (# of rows). Use with limit=-1 and mincount=1.
> The feature is called "namedistinct". Here is an example:
> http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=2&facet.limit=-1&facet.field=price
> http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=0&facet.limit=-1&facet.field=price
> http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=1&facet.limit=-1&facet.field=price
> This currently only works on facet.field.
> {code}
> <lst name="facet_fields">
>   <lst name="price">
>     <int name="numFacetTerms">14</int>
>     <int name="0.0">3</int><int name="11.5">1</int><int name="19.95">1</int><int name="74.99">1</int><int name="92.0">1</int><int name="179.99">1</int><int name="185.0">1</int><int name="279.95">1</int><int name="329.95">1</int><int name="350.0">1</int><int name="399.0">1</int><int name="479.95">1</int><int name="649.99">1</int><int name="2199.0">1</int>
>   </lst>
> </lst>
> {code} 
> Several people use this to get the group.field count (the # of groups).



[jira] [Updated] (SOLR-2242) Get distinct count of names for a facet field

Posted by "Bill Bell (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bill Bell updated SOLR-2242:
----------------------------

    Description: 
When returning facet.field=<name of field> you will get a list of matches for distinct values. This is normal behavior. This patch tells you how many distinct values you have (# of rows). Use with limit=-1 and mincount=1.


The feature is called "namedistinct". Here is an example:

Parameters:
facet.numTerms or f.<field>.facet.numTerms = true (default is false) - turn on distinct counting of terms

facet.field - the field to count the terms
It creates a new section in the facet section...

http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numTerms=true&facet.limit=-1&facet.field=price

http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numTerms=false&facet.limit=-1&facet.field=price

http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numTerms=true&facet.limit=-1&facet.field=price

This currently only works on facet.field.

{code}

<lst name="facet_counts">
<lst name="facet_queries"/>
<lst name="facet_fields">...</lst>
<lst name="facet_numTerms">
<lst name="localhost:8983/solr/">
<int name="price">14</int>
</lst>
<lst name="localhost:8080/solr/">
<int name="price">14</int>
</lst>
</lst>
<lst name="facet_dates"/>
<lst name="facet_ranges"/>
</lst>

OR with no sharding-

<lst name="facet_numTerms">
<int name="price">14</int>
</lst>

{code} 

Several people use this to get the group.field count (the # of groups).



  was:
When returning facet.field=<name of field> you will get a list of matches for distinct values. This is normal behavior. This patch tells you how many distinct values you have (# of rows). Use with limit=-1 and mincount=1.


The feature is called "namedistinct". Here is an example:

http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=2&facet.limit=-1&facet.field=price

http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=0&facet.limit=-1&facet.field=price

http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=1&facet.limit=-1&facet.field=price

This currently only works on facet.field.

{code}

<lst name="facet_fields">
  <lst name="price">
    <int name="numFacetTerms">14</int>
    <int name="0.0">3</int><int name="11.5">1</int><int name="19.95">1</int><int name="74.99">1</int><int name="92.0">1</int><int name="179.99">1</int><int name="185.0">1</int><int name="279.95">1</int><int name="329.95">1</int><int name="350.0">1</int><int name="399.0">1</int><int name="479.95">1</int><int name="649.99">1</int><int name="2199.0">1</int>
  </lst>
</lst>

{code} 

Several people use this to get the group.field count (the # of groups).



    
> Get distinct count of names for a facet field
> ---------------------------------------------
>
>                 Key: SOLR-2242
>                 URL: https://issues.apache.org/jira/browse/SOLR-2242
>             Project: Solr
>          Issue Type: New Feature
>          Components: Response Writers
>    Affects Versions: 4.0-ALPHA
>            Reporter: Bill Bell
>            Priority: Minor
>             Fix For: 4.1
>
>         Attachments: SOLR-2242-3x_5_tests.patch, SOLR-2242-3x.patch, SOLR-2242.patch, SOLR-2242.patch, SOLR-2242.patch, SOLR-2242.shard.withtests.patch, SOLR-2242.solr3.1-fix.patch, SOLR-2242.solr3.1.patch, SOLR.2242.solr3.1.patch, SOLR-2242-solr40-3.patch
>
>
> When returning facet.field=<name of field> you will get a list of matches for distinct values. This is normal behavior. This patch tells you how many distinct values you have (# of rows). Use with limit=-1 and mincount=1.
> The feature is called "namedistinct". Here is an example:
> Parameters:
> facet.numTerms or f.<field>.facet.numTerms = true (default is false) - turn on distinct counting of terms
> facet.field - the field to count the terms
> It creates a new section in the facet section...
> http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numTerms=true&facet.limit=-1&facet.field=price
> http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numTerms=false&facet.limit=-1&facet.field=price
> http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numTerms=true&facet.limit=-1&facet.field=price
> This currently only works on facet.field.
> {code}
> <lst name="facet_counts">
> <lst name="facet_queries"/>
> <lst name="facet_fields">...</lst>
> <lst name="facet_numTerms">
> <lst name="localhost:8983/solr/">
> <int name="price">14</int>
> </lst>
> <lst name="localhost:8080/solr/">
> <int name="price">14</int>
> </lst>
> </lst>
> <lst name="facet_dates"/>
> <lst name="facet_ranges"/>
> </lst>
> OR with no sharding-
> <lst name="facet_numTerms">
> <int name="price">14</int>
> </lst>
> {code} 
> Several people use this to get the group.field count (the # of groups).
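
For readers who want to consume the facet_numTerms section from a client, here is a minimal Python sketch. It assumes the facet.numTerms parameter and the response layout exactly as shown in the description above (both exist only with this patch applied); the function names build_num_terms_url and parse_num_terms are hypothetical, and the sample responses are copied from the description rather than fetched from a server.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET


def build_num_terms_url(base, field, shards=None):
    """Build a select URL using the parameters listed in the description."""
    params = []
    if shards:
        params.append(("shards", ",".join(shards)))
    params += [
        ("indent", "true"),
        ("q", "*:*"),
        ("facet", "true"),
        ("facet.mincount", "1"),
        ("facet.numTerms", "true"),  # parameter added by the patch
        ("facet.limit", "-1"),
        ("facet.field", field),
    ]
    return base + "?" + urlencode(params, safe="*:,/")


def parse_num_terms(facet_counts_xml):
    """Return {shard: {field: count}}; the key "" means a non-sharded reply."""
    root = ET.fromstring(facet_counts_xml.strip())
    section = root.find("./lst[@name='facet_numTerms']")
    result = {}
    if section is None:
        return result
    # Non-sharded layout: <int> elements sit directly under facet_numTerms.
    for e in section.findall("./int"):
        result.setdefault("", {})[e.get("name")] = int(e.text)
    # Sharded layout: one nested <lst> per shard.
    for shard in section.findall("./lst"):
        result[shard.get("name")] = {
            e.get("name"): int(e.text) for e in shard.findall("./int")
        }
    return result


SHARDED_RESPONSE = """
<lst name="facet_counts">
  <lst name="facet_queries"/>
  <lst name="facet_fields"/>
  <lst name="facet_numTerms">
    <lst name="localhost:8983/solr/">
      <int name="price">14</int>
    </lst>
    <lst name="localhost:8080/solr/">
      <int name="price">14</int>
    </lst>
  </lst>
</lst>
"""

UNSHARDED_RESPONSE = """
<lst name="facet_counts">
  <lst name="facet_numTerms">
    <int name="price">14</int>
  </lst>
</lst>
"""

print(build_num_terms_url("http://localhost:8983/solr/select", "price"))
print(parse_num_terms(SHARDED_RESPONSE))
print(parse_num_terms(UNSHARDED_RESPONSE))
```

Note that in the sharded layout the counts are reported per shard and are not merged, so the client still has to decide how to interpret them.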



[jira] [Commented] (SOLR-2242) Get distinct count of names for a facet field

Posted by "Erick Erickson (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13174152#comment-13174152 ] 

Erick Erickson commented on SOLR-2242:
--------------------------------------

OK, it seems like we have several themes here. I'd like to get a reasonable consensus before going forward... I'll put out a straw-man proposal here and we can go from there.

But let's figure out where we're going before revamping stuff yet again.

1> Distributed support. I sure don't see a good way to support this currently. Perhaps some of the future enhancements will make this easier (thinking distributed TF/IDF & such while being totally ignorant of that code), but returning the entire list of constraints (or names or terms or whatever we call it) is just a bad idea. The first time someone tries this on a field with 1,000,000 terms (yes, I've seen this) it'll just blow things up. I'm also slightly anti the min/max idea. I'm not sure what value there is in telling someone "there are between 10,000 and 90,000 distinct values". And if it's a field with just a few pre-defined values, that information is already known anyway.... But if someone can show a use-case here I'm not completely against it. But I'd like to see the use case first, not "someone might find it useful" <G>.

2> back compat. Cody's suggestion seems to be the slickest in terms of not breaking things, but we use attributes in just a few places, are there reasons NOT to do it that way?

3> Possibly add a new JIRA for changing the facet response format to be tolerant of sub-fields, but don't do that here.

Again, I want a clearly defined end point for the concerns raised before we dive back in here....
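
To make the distributed concern in point 1 concrete: per-shard distinct counts cannot simply be added, because the same term may live on several shards. The sketch below (Python, with made-up shard contents; it is not how Solr or any of the attached patches merges results) shows that the per-shard counts only give a min/max range, while the exact answer requires shipping every term to the coordinator.

```python
def distinct_bounds(shard_counts):
    """Bounds on the global distinct count given only per-shard counts.

    Lower bound: the largest shard count (all other shards could be subsets).
    Upper bound: the sum of the counts (no term shared between shards).
    """
    return max(shard_counts, default=0), sum(shard_counts)


def exact_distinct(shard_terms):
    """Exact global distinct count; needs the full term set of every shard."""
    return len(set().union(*shard_terms))


# Hypothetical term sets for two shards of a "price" field.
shard_a = {"0.0", "11.5", "19.95"}
shard_b = {"19.95", "74.99"}

print(distinct_bounds([len(shard_a), len(shard_b)]))  # prints "(3, 5)"
print(exact_distinct([shard_a, shard_b]))             # prints "4"
```

With a field of 1,000,000 terms the exact variant means moving a million terms per shard over the wire, which is the blow-up described above; the bounds are cheap but can be as wide as the "between 10,000 and 90,000" example.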


                
> Get distinct count of names for a facet field
> ---------------------------------------------
>
>                 Key: SOLR-2242
>                 URL: https://issues.apache.org/jira/browse/SOLR-2242
>             Project: Solr
>          Issue Type: New Feature
>          Components: Response Writers
>    Affects Versions: 4.0
>            Reporter: Bill Bell
>            Assignee: Simon Willnauer
>            Priority: Minor
>             Fix For: 4.0
>
>         Attachments: NumFacetTermsFacetsTest.java, SOLR-2242-notworkingtest.patch, SOLR-2242.patch, SOLR-2242.patch, SOLR-2242.shard.patch, SOLR-2242.shard.patch, SOLR-2242.shard.withtests.patch, SOLR-2242.solr3.1-fix.patch, SOLR-2242.solr3.1.patch, SOLR.2242.solr3.1.patch, SOLR.2242.v2.patch
>
>
> When returning facet.field=<name of field> you will get a list of matches for distinct values. This is normal behavior. This patch tells you how many distinct values you have (# of rows). Use with limit=-1 and mincount=1.
> The feature is called "namedistinct". Here is an example:
> http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=2&facet.limit=-1&facet.field=price
> http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=0&facet.limit=-1&facet.field=price
> http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=1&facet.limit=-1&facet.field=price
> This currently only works on facet.field.
> {code}
> <lst name="facet_fields">
>   <lst name="price">
>     <int name="numFacetTerms">14</int>
>     <int name="0.0">3</int><int name="11.5">1</int><int name="19.95">1</int><int name="74.99">1</int><int name="92.0">1</int><int name="179.99">1</int><int name="185.0">1</int><int name="279.95">1</int><int name="329.95">1</int><int name="350.0">1</int><int name="399.0">1</int><int name="479.95">1</int><int name="649.99">1</int><int name="2199.0">1</int>
>   </lst>
> </lst>
> {code} 
> Several people use this to get the group.field count (the # of groups).



[jira] [Updated] (SOLR-2242) Get distinct count of names for a facet field

Posted by "Bill Bell (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bill Bell updated SOLR-2242:
----------------------------

    Attachment: SOLR-2242-notworkingtest.patch

The test case gives an error. I am not familiar with this error.

> Get distinct count of names for a facet field
> ---------------------------------------------
>
>                 Key: SOLR-2242
>                 URL: https://issues.apache.org/jira/browse/SOLR-2242
>             Project: Solr
>          Issue Type: New Feature
>          Components: Response Writers
>    Affects Versions: 4.0
>            Reporter: Bill Bell
>            Assignee: Simon Willnauer
>            Priority: Minor
>             Fix For: 4.0
>
>         Attachments: SOLR-2242-notworkingtest.patch, SOLR-2242.patch, SOLR-2242.shard.patch, SOLR-2242.shard.patch, SOLR-2242.solr3.1.patch, SOLR.2242.solr3.1.patch, SOLR.2242.v2.patch
>
>
> When returning facet.field=<name of field> you will get a list of matches for distinct values. This is normal behavior. This patch tells you how many distinct values you have (# of rows). Use with limit=-1 and mincount=1.
> The feature is called "namedistinct". Here is an example:
> http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=2&facet.limit=-1&facet.field=price
> http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=0&facet.limit=-1&facet.field=price
> http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=1&facet.limit=-1&facet.field=price
> This currently only works on facet.field.
> {code}
> <lst name="facet_fields">
>   <lst name="price">
>     <int name="numFacetTerms">14</int>
>     <int name="0.0">3</int><int name="11.5">1</int><int name="19.95">1</int><int name="74.99">1</int><int name="92.0">1</int><int name="179.99">1</int><int name="185.0">1</int><int name="279.95">1</int><int name="329.95">1</int><int name="350.0">1</int><int name="399.0">1</int><int name="479.95">1</int><int name="649.99">1</int><int name="2199.0">1</int>
>   </lst>
> </lst>
> {code} 
> Several people use this to get the group.field count (the # of groups).



[jira] [Commented] (SOLR-2242) Get distinct count of names for a facet field

Posted by "Bill Bell (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13073334#comment-13073334 ] 

Bill Bell commented on SOLR-2242:
---------------------------------

OK, I like the warning message idea. Also, it does depend on the sharding approach, since some people shard by date... In many of those cases maxTerms would do what I need.

List:

1. Change the facet.field format.
2. Get it working with sharding.
3. Change code to cache the numFacetTerms/numTerms and remove the code that caches the huge term list.

I can do all of this, but I would like some help with #3.

Bill


> Get distinct count of names for a facet field
> ---------------------------------------------
>
>                 Key: SOLR-2242
>                 URL: https://issues.apache.org/jira/browse/SOLR-2242
>             Project: Solr
>          Issue Type: New Feature
>          Components: Response Writers
>    Affects Versions: 4.0
>            Reporter: Bill Bell
>            Assignee: Simon Willnauer
>            Priority: Minor
>             Fix For: 4.0
>
>         Attachments: NumFacetTermsFacetsTest.java, SOLR-2242-notworkingtest.patch, SOLR-2242.patch, SOLR-2242.patch, SOLR-2242.shard.patch, SOLR-2242.shard.patch, SOLR-2242.shard.withtests.patch, SOLR-2242.solr3.1.patch, SOLR.2242.solr3.1.patch, SOLR.2242.v2.patch
>
>
> When returning facet.field=<name of field> you will get a list of matches for distinct values. This is normal behavior. This patch tells you how many distinct values you have (# of rows). Use with limit=-1 and mincount=1.
> The feature is called "namedistinct". Here is an example:
> http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=2&facet.limit=-1&facet.field=price
> http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=0&facet.limit=-1&facet.field=price
> http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=1&facet.limit=-1&facet.field=price
> This currently only works on facet.field.
> {code}
> <lst name="facet_fields">
>   <lst name="price">
>     <int name="numFacetTerms">14</int>
>     <int name="0.0">3</int><int name="11.5">1</int><int name="19.95">1</int><int name="74.99">1</int><int name="92.0">1</int><int name="179.99">1</int><int name="185.0">1</int><int name="279.95">1</int><int name="329.95">1</int><int name="350.0">1</int><int name="399.0">1</int><int name="479.95">1</int><int name="649.99">1</int><int name="2199.0">1</int>
>   </lst>
> </lst>
> {code} 
> Several people use this to get the group.field count (the # of groups).



[jira] [Commented] (SOLR-2242) Get distinct count of names for a facet field

Posted by "Bill Bell (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13031539#comment-13031539 ] 

Bill Bell commented on SOLR-2242:
---------------------------------

OK. Can you point me in the right direction? Are you a committer? Can we get this committed?


> Get distinct count of names for a facet field
> ---------------------------------------------
>
>                 Key: SOLR-2242
>                 URL: https://issues.apache.org/jira/browse/SOLR-2242
>             Project: Solr
>          Issue Type: New Feature
>          Components: Response Writers
>    Affects Versions: 4.0
>            Reporter: Bill Bell
>            Priority: Minor
>             Fix For: 4.0
>
>         Attachments: SOLR-2242.patch, SOLR-2242.solr3.1.patch, SOLR.2242.solr3.1.patch, SOLR.2242.v2.patch
>
>
> When returning facet.field=<name of field> you will get a list of matches for distinct values. This is normal behavior. This patch tells you how many distinct values you have (# of rows). Use with limit=-1 and mincount=1.
> The feature is called "namedistinct". Here is an example:
> http://localhost:8983/solr/select?q=*:*&facet=true&facet.field=manu&facet.mincount=1&facet.limit=-1&f.manu.facet.namedistinct=0&facet.field=price&f.price.facet.namedistinct=1
> Here is an example on field "hgid" (without namedistinct):
> {code}
> <lst name="facet_fields">
>   <lst name="hgid">
>     <int name="HGPY0000045FD36D4000A">1</int>
>     <int name="HGPY00000FBC6690453A9">1</int>
>     <int name="HGPY00001E44ED6C4FB3B">1</int>
>     <int name="HGPY00001FA631034A1B8">1</int>
>     <int name="HGPY00003317ABAC43B48">1</int>
>     <int name="HGPY00003A17B2294CB5A">5</int>
>     <int name="HGPY00003ADD2B3D48C39">1</int>
>   </lst>
> </lst>
> {code}
> With namedistinct (HGPY0000045FD36D4000A, HGPY00000FBC6690453A9, HGPY00001E44ED6C4FB3B, HGPY00001FA631034A1B8, HGPY00003317ABAC43B48, HGPY00003A17B2294CB5A, HGPY00003ADD2B3D48C39). This returns number of rows (7), not the number of values (11).
> {code}
> <lst name="facet_fields">
>   <lst name="hgid">
>     <int name="_count_">7</int>
>   </lst>
> </lst>
> {code}
> This works really well for getting the total number of groups for group.field=hgid. Enjoy!



[jira] [Updated] (SOLR-2242) Get distinct count of names for a facet field

Posted by "Bill Bell (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bill Bell updated SOLR-2242:
----------------------------

    Attachment:     (was: SOLR-2242-solr40.patch)
    
> Get distinct count of names for a facet field
> ---------------------------------------------
>
>                 Key: SOLR-2242
>                 URL: https://issues.apache.org/jira/browse/SOLR-2242
>             Project: Solr
>          Issue Type: New Feature
>          Components: Response Writers
>    Affects Versions: 4.0
>            Reporter: Bill Bell
>            Priority: Minor
>             Fix For: 4.0
>
>         Attachments: SOLR-2242-solr40-3.patch, SOLR-2242.patch, SOLR-2242.patch, SOLR-2242.patch, SOLR-2242.shard.patch, SOLR-2242.shard.withtests.patch, SOLR-2242.solr3.1-fix.patch, SOLR-2242.solr3.1.patch, SOLR-2242.solr35.patch, SOLR.2242.solr3.1.patch
>
>
> When returning facet.field=<name of field> you will get a list of matches for distinct values. This is normal behavior. This patch tells you how many distinct values you have (# of rows). Use with limit=-1 and mincount=1.
> The feature is called "namedistinct". Here is an example:
> http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=2&facet.limit=-1&facet.field=price
> http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=0&facet.limit=-1&facet.field=price
> http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=1&facet.limit=-1&facet.field=price
> This currently only works on facet.field.
> {code}
> <lst name="facet_fields">
>   <lst name="price">
>     <int name="numFacetTerms">14</int>
>     <int name="0.0">3</int><int name="11.5">1</int><int name="19.95">1</int><int name="74.99">1</int><int name="92.0">1</int><int name="179.99">1</int><int name="185.0">1</int><int name="279.95">1</int><int name="329.95">1</int><int name="350.0">1</int><int name="399.0">1</int><int name="479.95">1</int><int name="649.99">1</int><int name="2199.0">1</int>
>   </lst>
> </lst>
> {code} 
> Several people use this to get the group.field count (the # of groups).



[jira] [Updated] (SOLR-2242) Get distinct count of names for a facet field

Posted by "Bill Bell (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bill Bell updated SOLR-2242:
----------------------------

    Attachment: SOLR-2242.shard.withtests.patch

I left the group in there; we can uncomment it when it starts working again (if it does).


> Get distinct count of names for a facet field
> ---------------------------------------------
>
>                 Key: SOLR-2242
>                 URL: https://issues.apache.org/jira/browse/SOLR-2242
>             Project: Solr
>          Issue Type: New Feature
>          Components: Response Writers
>    Affects Versions: 4.0
>            Reporter: Bill Bell
>            Assignee: Simon Willnauer
>            Priority: Minor
>             Fix For: 4.0
>
>         Attachments: SOLR-2242-notworkingtest.patch, SOLR-2242.patch, SOLR-2242.shard.patch, SOLR-2242.shard.patch, SOLR-2242.shard.withtests.patch, SOLR-2242.solr3.1.patch, SOLR.2242.solr3.1.patch, SOLR.2242.v2.patch
>
>
> When returning facet.field=<name of field> you will get a list of matches for distinct values. This is normal behavior. This patch tells you how many distinct values you have (# of rows). Use with limit=-1 and mincount=1.
> The feature is called "namedistinct". Here is an example:
> http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=2&facet.limit=-1&facet.field=price
> http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=0&facet.limit=-1&facet.field=price
> http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=1&facet.limit=-1&facet.field=price
> This currently only works on facet.field.
> {code}
> <lst name="facet_fields">
>   <lst name="price">
>     <int name="numFacetTerms">14</int>
>     <int name="0.0">3</int><int name="11.5">1</int><int name="19.95">1</int><int name="74.99">1</int><int name="92.0">1</int><int name="179.99">1</int><int name="185.0">1</int><int name="279.95">1</int><int name="329.95">1</int><int name="350.0">1</int><int name="399.0">1</int><int name="479.95">1</int><int name="649.99">1</int><int name="2199.0">1</int>
>   </lst>
> </lst>
> {code} 
> Several people use this to get the group.field count (the # of groups).
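
For anyone scripting against this, here is a minimal Python sketch (not part of the patch) that rebuilds the three example URLs from the description. The helper name facet_url is made up for illustration; the parameter names and values are copied from the URLs above. The thread does not spell out what facet.numFacetTerms=0 means here, so the sketch only builds the requests and makes no claim about the responses.

```python
from urllib.parse import urlencode

BASE = "http://localhost:8983/solr/select"

def facet_url(num_facet_terms, field="price"):
    """Build one of the example requests (hypothetical helper, for illustration)."""
    params = [
        ("shards", "localhost:8983/solr,localhost:7574/solr"),
        ("indent", "true"),
        ("q", "*:*"),
        ("facet", "true"),
        ("facet.mincount", 1),
        ("facet.numFacetTerms", num_facet_terms),
        ("facet.limit", -1),
        ("facet.field", field),
    ]
    # Keep ':', '/', ',' and '*' readable instead of percent-encoding them.
    return BASE + "?" + urlencode(params, safe=":/,*")

# Same order as the three URLs in the description: 2, 0, 1.
urls = [facet_url(n) for n in (2, 0, 1)]
for url in urls:
    print(url)
```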



[jira] Issue Comment Edited: (SOLR-2242) Get distinct count of names for a facet field

Posted by "Bill Bell (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13006792#comment-13006792 ] 

Bill Bell edited comment on SOLR-2242 at 3/15/11 5:04 AM:
----------------------------------------------------------

I am going to use your suggestion. You will not have to set the limit or mincount. Getting the numFacetTerms will be optional, and you will also be able to leave out the hgid facet counts. I propose this (please comment):

This will ONLY output the numFacetTerms (no hgid facet counts):
http://localhost:8983/solr/select?q=*:*&facet=true&facet.field=hgid&f.hgid.facet.numfacetterms=1

This assumes the count is computed with mincount=1 and limit=-1.

{code}
<lst name="facet_fields">
  <lst name="hgid">
   <int name="numFacetTerms">7</int>  <!-- this is not 11 -->
  </lst>
</lst>
{code}

This will output the numFacetTerms AND hgid:
http://localhost:8983/solr/select?q=*:*&facet=true&facet.field=hgid&facet.mincount=1&f.hgid.facet.numfacetterms=2

{code}
<lst name="facet_fields">
  <lst name="hgid">
   <int name="numFacetTerms">7</int>  <!-- this is not 11 -->
   <lst name="hgid">
   	<int name="HGPY0000045FD36D4000A">1</int>
   	<int name="HGPY00000FBC6690453A9">1</int>
   	<int name="HGPY00001E44ED6C4FB3B">1</int>
   	<int name="HGPY00001FA631034A1B8">1</int>
   	<int name="HGPY00003317ABAC43B48">1</int>
   	<int name="HGPY00003A17B2294CB5A">5</int>
   	<int name="HGPY00003ADD2B3D48C39">1</int>
   </lst>
  </lst>
</lst>
{code}
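
As a rough client-side illustration of the two proposed shapes, here is a Python sketch that reads the "both" response above. It is only a sketch under the assumption that the response looks exactly like the proposal (a numFacetTerms entry plus a nested hgid list); the variable names are mine.

```python
import xml.etree.ElementTree as ET

# The proposed "numFacetTerms AND hgid" response, copied from above.
response = """
<lst name="facet_fields">
  <lst name="hgid">
    <int name="numFacetTerms">7</int>
    <lst name="hgid">
      <int name="HGPY0000045FD36D4000A">1</int>
      <int name="HGPY00000FBC6690453A9">1</int>
      <int name="HGPY00001E44ED6C4FB3B">1</int>
      <int name="HGPY00001FA631034A1B8">1</int>
      <int name="HGPY00003317ABAC43B48">1</int>
      <int name="HGPY00003A17B2294CB5A">5</int>
      <int name="HGPY00003ADD2B3D48C39">1</int>
    </lst>
  </lst>
</lst>
"""

root = ET.fromstring(response)
field = root.find("lst[@name='hgid']")

# The distinct-term count is always present in both proposed shapes.
num_terms = int(field.find("int[@name='numFacetTerms']").text)

# The nested list is only present when both outputs were requested.
inner = field.find("lst[@name='hgid']")
counts = {} if inner is None else {e.get("name"): int(e.text) for e in inner.findall("int")}

print(num_terms, len(counts), sum(counts.values()))
```

With the count-only response, inner would be None and counts would stay empty, so the same code handles both cases.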

      was (Author: billnbell):
    I am going to use your suggestion. You will not have to set the limit or mincount. Getting the numFacetTerms will be optional, and you also will be able to NOT get the hgids as well. I propose this (please comment):

This will ONLY output the numFacetTerms (no hgid facet counts):
http://localhost:8983/solr/select?q=*:*&facet=true&facet.field=hgid&f.hgid.facet.numfacetterms=true

This assumes the count will be mincount=1, and limit=-1

{code}
<lst name="facet_fields">
  <lst name="hgid">
   <int name="numFacetTerms">7</int>  <!-- this is not 11 -->
  </lst>
</lst>
{code}

This will output the numFacetTerms AND hgid:
http://localhost:8983/solr/select?q=*:*&facet=true&facet.field=hgid&facet.mincount=1&f.hgid.facet.numfacetterms=both

{code}
<lst name="facet_fields">
  <lst name="hgid">
   <int name="numFacetTerms">7</int>  <!-- this is not 11 -->
   <lst name="hgid">
   	<int name="HGPY0000045FD36D4000A">1</int>
   	<int name="HGPY00000FBC6690453A9">1</int>
   	<int name="HGPY00001E44ED6C4FB3B">1</int>
   	<int name="HGPY00001FA631034A1B8">1</int>
   	<int name="HGPY00003317ABAC43B48">1</int>
   	<int name="HGPY00003A17B2294CB5A">5</int>
   	<int name="HGPY00003ADD2B3D48C39">1</int>
   </lst>
  </lst>
</lst>
{code}
  
> Get distinct count of names for a facet field
> ---------------------------------------------
>
>                 Key: SOLR-2242
>                 URL: https://issues.apache.org/jira/browse/SOLR-2242
>             Project: Solr
>          Issue Type: New Feature
>          Components: Response Writers
>    Affects Versions: 4.0
>            Reporter: Bill Bell
>            Priority: Minor
>             Fix For: 4.0
>
>         Attachments: SOLR-2242-distinctFacet.patch
>
>
> When returning facet.field=<name of field> you will get a list of matches for distinct values. This is normal behavior. This patch tells you how many distinct values you have (# of rows). Use with limit=-1 and mincount=1.
> The feature is called "namedistinct". Here is an example:
> http://localhost:8983/solr/select?q=*:*&facet=true&facet.field=manu&facet.mincount=1&facet.limit=-1&f.manu.facet.namedistinct=0&facet.field=price&f.price.facet.namedistinct=1
> Here is an example on field "hgid" (without namedistinct):
> {code}
> - <lst name="facet_fields">
> - <lst name="hgid">
>   <int name="HGPY0000045FD36D4000A">1</int> 
>   <int name="HGPY00000FBC6690453A9">1</int> 
>   <int name="HGPY00001E44ED6C4FB3B">1</int> 
>   <int name="HGPY00001FA631034A1B8">1</int> 
>   <int name="HGPY00003317ABAC43B48">1</int> 
>   <int name="HGPY00003A17B2294CB5A">5</int> 
>   <int name="HGPY00003ADD2B3D48C39">1</int> 
>   </lst>
>   </lst>
> {code}
> With namedistinct (HGPY0000045FD36D4000A, HGPY00000FBC6690453A9, HGPY00001E44ED6C4FB3B, HGPY00001FA631034A1B8, HGPY00003317ABAC43B48, HGPY00003A17B2294CB5A, HGPY00003ADD2B3D48C39). This returns number of rows (7), not the number of values (11).
> {code}
> - <lst name="facet_fields">
> - <lst name="hgid">
>   <int name="_count_">7</int> 
>   </lst>
>   </lst>
> {code}
> This works actually really good to get total number of fields for a group.field=hgid. Enjoy!



[jira] Commented: (SOLR-2242) Get distinct count of names for a facet field

Posted by "Bill Bell (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13006756#comment-13006756 ] 

Bill Bell commented on SOLR-2242:
---------------------------------

Thanks.

Not sure how to get the facet distinct count without looping, but I'll
look into that. Not sure what "constraints" means?

I agree that you should not have to specify limit, but mincount should
apply, since many times I want 1 or higher.

Would we always include this, or just add it as an option?

f.hgid.facet.namedistinct=1 ?

Proposed:
{code}
"facet fields" : {"hgid" : {
  "missing" : 25,
  "namedistinct" : 1250,
  "counts" : ["constraint",10,...]
}}
{code}


Then we add others as needed?

Or do you mean?

f.hgid.facet.constraints = namedistinct() with the option to specify more
than one?

f.hgid.facet.constraints = namedistinct(),missing()


Proposed:
{code}
"facet fields" : {"hgid" : {
  "constraints" : ["missing()",25,"namedistinct()",1250],
  "counts" : ["constraint",10,...]
}}
{code}
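
To show what the second variant would ask of the server, here is a small Python sketch of how a value such as namedistinct(),missing() could be split into constraint names. The parameter f.hgid.facet.constraints is only a proposal in this comment, and parse_constraints is a hypothetical helper, not anything in Solr.

```python
def parse_constraints(param_value):
    """Split 'namedistinct(),missing()' into ['namedistinct', 'missing'].

    Hypothetical helper: the parameter format is only proposed in this thread.
    """
    names = []
    for token in param_value.split(","):
        token = token.strip()
        if not token.endswith("()"):
            raise ValueError(f"expected name(): {token!r}")
        names.append(token[:-2])
    return names

requested = parse_constraints("namedistinct(),missing()")
print(requested)
```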


> Get distinct count of names for a facet field
> ---------------------------------------------
>
>                 Key: SOLR-2242
>                 URL: https://issues.apache.org/jira/browse/SOLR-2242
>             Project: Solr
>          Issue Type: New Feature
>          Components: Response Writers
>    Affects Versions: 4.0
>            Reporter: Bill Bell
>            Priority: Minor
>             Fix For: 4.0
>
>         Attachments: SOLR-2242-distinctFacet.patch
>
>
> When returning facet.field=<name of field> you will get a list of matches for distinct values. This is normal behavior. This patch tells you how many distinct values you have (# of rows). Use with limit=-1 and mincount=1.
> The feature is called "namedistinct". Here is an example:
> http://localhost:8983/solr/select?q=*:*&facet=true&facet.field=manu&facet.mincount=1&facet.limit=-1&f.manu.facet.namedistinct=0&facet.field=price&f.price.facet.namedistinct=1
> Here is an example on field "hgid" (without namedistinct):
> {code}
> - <lst name="facet_fields">
> - <lst name="hgid">
>   <int name="HGPY0000045FD36D4000A">1</int> 
>   <int name="HGPY00000FBC6690453A9">1</int> 
>   <int name="HGPY00001E44ED6C4FB3B">1</int> 
>   <int name="HGPY00001FA631034A1B8">1</int> 
>   <int name="HGPY00003317ABAC43B48">1</int> 
>   <int name="HGPY00003A17B2294CB5A">5</int> 
>   <int name="HGPY00003ADD2B3D48C39">1</int> 
>   </lst>
>   </lst>
> {code}
> With namedistinct (HGPY0000045FD36D4000A, HGPY00000FBC6690453A9, HGPY00001E44ED6C4FB3B, HGPY00001FA631034A1B8, HGPY00003317ABAC43B48, HGPY00003A17B2294CB5A, HGPY00003ADD2B3D48C39). This returns number of rows (7), not the number of values (11).
> {code}
> - <lst name="facet_fields">
> - <lst name="hgid">
>   <int name="_count_">7</int> 
>   </lst>
>   </lst>
> {code}
> This works actually really good to get total number of fields for a group.field=hgid. Enjoy!



[jira] Commented: (SOLR-2242) Get distinct count of names for a facet field

Posted by "Bill Bell (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13008825#comment-13008825 ] 

Bill Bell commented on SOLR-2242:
---------------------------------

Can someone look this patch over?

Thanks.

> Get distinct count of names for a facet field
> ---------------------------------------------
>
>                 Key: SOLR-2242
>                 URL: https://issues.apache.org/jira/browse/SOLR-2242
>             Project: Solr
>          Issue Type: New Feature
>          Components: Response Writers
>    Affects Versions: 4.0
>            Reporter: Bill Bell
>            Priority: Minor
>             Fix For: 4.0
>
>         Attachments: SOLR.2242.v2.patch
>
>
> When returning facet.field=<name of field> you will get a list of matches for distinct values. This is normal behavior. This patch tells you how many distinct values you have (# of rows). Use with limit=-1 and mincount=1.
> The feature is called "namedistinct". Here is an example:
> http://localhost:8983/solr/select?q=*:*&facet=true&facet.field=manu&facet.mincount=1&facet.limit=-1&f.manu.facet.namedistinct=0&facet.field=price&f.price.facet.namedistinct=1
> Here is an example on field "hgid" (without namedistinct):
> {code}
> - <lst name="facet_fields">
> - <lst name="hgid">
>   <int name="HGPY0000045FD36D4000A">1</int> 
>   <int name="HGPY00000FBC6690453A9">1</int> 
>   <int name="HGPY00001E44ED6C4FB3B">1</int> 
>   <int name="HGPY00001FA631034A1B8">1</int> 
>   <int name="HGPY00003317ABAC43B48">1</int> 
>   <int name="HGPY00003A17B2294CB5A">5</int> 
>   <int name="HGPY00003ADD2B3D48C39">1</int> 
>   </lst>
>   </lst>
> {code}
> With namedistinct (HGPY0000045FD36D4000A, HGPY00000FBC6690453A9, HGPY00001E44ED6C4FB3B, HGPY00001FA631034A1B8, HGPY00003317ABAC43B48, HGPY00003A17B2294CB5A, HGPY00003ADD2B3D48C39). This returns number of rows (7), not the number of values (11).
> {code}
> - <lst name="facet_fields">
> - <lst name="hgid">
>   <int name="_count_">7</int> 
>   </lst>
>   </lst>
> {code}
> This works actually really good to get total number of fields for a group.field=hgid. Enjoy!



[jira] Commented: (SOLR-2242) Get distinct count of names for a facet field

Posted by "Bill Bell (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13006779#comment-13006779 ] 

Bill Bell commented on SOLR-2242:
---------------------------------

No, actually namedistinct is not the number of values. It is the number of names.

{code}
<lst name="facet_fields">
  <lst name="hgid">
   <int name="HGPY0000045FD36D4000A">1</int>
   <int name="HGPY00000FBC6690453A9">1</int>
   <int name="HGPY00001E44ED6C4FB3B">1</int>
   <int name="HGPY00001FA631034A1B8">1</int>
   <int name="HGPY00003317ABAC43B48">1</int>
   <int name="HGPY00003A17B2294CB5A">5</int>
   <int name="HGPY00003ADD2B3D48C39">1</int>
  </lst>
</lst>
{code}

Becomes:

{code}
<lst name="facet_fields">
  <lst name="hgid">
   <int name="namedistinct">7</int>  <!-- this is not 11 -->
   <lst name="hgid">
   	<int name="HGPY0000045FD36D4000A">1</int>
   	<int name="HGPY00000FBC6690453A9">1</int>
   	<int name="HGPY00001E44ED6C4FB3B">1</int>
   	<int name="HGPY00001FA631034A1B8">1</int>
   	<int name="HGPY00003317ABAC43B48">1</int>
   	<int name="HGPY00003A17B2294CB5A">5</int>
   	<int name="HGPY00003ADD2B3D48C39">1</int>
   </lst>
  </lst>
</lst>
{code}
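
To make the 7 versus 11 distinction concrete, here is a short Python sketch (not part of the patch) that computes both numbers from the hgid counts shown above. The variable names are illustrative.

```python
# Facet counts for field "hgid", copied from the example above.
hgid_counts = {
    "HGPY0000045FD36D4000A": 1,
    "HGPY00000FBC6690453A9": 1,
    "HGPY00001E44ED6C4FB3B": 1,
    "HGPY00001FA631034A1B8": 1,
    "HGPY00003317ABAC43B48": 1,
    "HGPY00003A17B2294CB5A": 5,
    "HGPY00003ADD2B3D48C39": 1,
}

# namedistinct: how many distinct names (facet rows) came back.
name_distinct = len(hgid_counts)

# Sum of the per-name counts: how many field values matched in total.
total_values = sum(hgid_counts.values())

print(name_distinct, total_values)  # 7 11
```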


> Get distinct count of names for a facet field
> ---------------------------------------------
>
>                 Key: SOLR-2242
>                 URL: https://issues.apache.org/jira/browse/SOLR-2242
>             Project: Solr
>          Issue Type: New Feature
>          Components: Response Writers
>    Affects Versions: 4.0
>            Reporter: Bill Bell
>            Priority: Minor
>             Fix For: 4.0
>
>         Attachments: SOLR-2242-distinctFacet.patch
>
>
> When returning facet.field=<name of field> you will get a list of matches for distinct values. This is normal behavior. This patch tells you how many distinct values you have (# of rows). Use with limit=-1 and mincount=1.
> The feature is called "namedistinct". Here is an example:
> http://localhost:8983/solr/select?q=*:*&facet=true&facet.field=manu&facet.mincount=1&facet.limit=-1&f.manu.facet.namedistinct=0&facet.field=price&f.price.facet.namedistinct=1
> Here is an example on field "hgid" (without namedistinct):
> {code}
> - <lst name="facet_fields">
> - <lst name="hgid">
>   <int name="HGPY0000045FD36D4000A">1</int> 
>   <int name="HGPY00000FBC6690453A9">1</int> 
>   <int name="HGPY00001E44ED6C4FB3B">1</int> 
>   <int name="HGPY00001FA631034A1B8">1</int> 
>   <int name="HGPY00003317ABAC43B48">1</int> 
>   <int name="HGPY00003A17B2294CB5A">5</int> 
>   <int name="HGPY00003ADD2B3D48C39">1</int> 
>   </lst>
>   </lst>
> {code}
> With namedistinct (HGPY0000045FD36D4000A, HGPY00000FBC6690453A9, HGPY00001E44ED6C4FB3B, HGPY00001FA631034A1B8, HGPY00003317ABAC43B48, HGPY00003A17B2294CB5A, HGPY00003ADD2B3D48C39). This returns number of rows (7), not the number of values (11).
> {code}
> - <lst name="facet_fields">
> - <lst name="hgid">
>   <int name="_count_">7</int> 
>   </lst>
>   </lst>
> {code}
> This works actually really good to get total number of fields for a group.field=hgid. Enjoy!



[jira] Commented: (SOLR-2242) Get distinct count of names for a facet field

Posted by "Bill Bell (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13006805#comment-13006805 ] 

Bill Bell commented on SOLR-2242:
---------------------------------

OK, this is complete.

Sample query:

http://localhost:8983/solr/select?q=*:*&facet=true&facet.field=cat&rows=0&facet.numfacetterms=2&facet.limit=4

Sample output:
{code}
<?xml version="1.0" encoding="UTF-8" ?>
<response>
  <lst name="responseHeader">
    <int name="status">0</int>
    <int name="QTime">0</int>
    <lst name="params">
      <str name="facet.numfacetterms">2</str>
      <str name="facet">true</str>
      <str name="q">*:*</str>
      <str name="facet.limit">4</str>
      <str name="facet.field">cat</str>
      <str name="rows">0</str>
    </lst>
  </lst>
  <result name="response" numFound="17" start="0" />
  <lst name="facet_counts">
    <lst name="facet_queries" />
    <lst name="facet_fields">
      <lst name="cat">
        <int name="numFacetTerms">14</int>
        <lst name="counts">
          <int name="electronics">14</int>
          <int name="memory">3</int>
          <int name="connector">2</int>
          <int name="graphics card">2</int>
        </lst>
      </lst>
    </lst>
    <lst name="facet_dates" />
    <lst name="facet_ranges" />
  </lst>
</response>
{code}

In JSON:

{code}
{"responseHeader":{"status":0,"QTime":0,"params":{"facet.numfacetterms":"2","facet":"true","q":"*:*","facet.limit":"4","facet.field":"cat","wt":"json","rows":"0"}},"response":{"numFound":17,"start":0,"docs":[]},"facet_counts":{"facet_queries":{},"facet_fields":{"cat":["numFacetTerms",14,"counts",["electronics",14,"memory",3,"connector",2,"graphics card",2]]},"facet_dates":{},"facet_ranges":{}}}

{code}
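
For reference, a minimal Python sketch of reading this JSON shape on the client side. It assumes the flat [name, value, name, value, ...] layout shown in the sample; pairs_to_dict is a hypothetical helper name. Note that numFacetTerms is 14 even though facet.limit=4 means only four terms are listed under "counts".

```python
import json

# Facet section of the JSON response above (trimmed to the relevant part).
raw = (
    '{"facet_counts":{"facet_fields":{"cat":["numFacetTerms",14,"counts",'
    '["electronics",14,"memory",3,"connector",2,"graphics card",2]]}}}'
)

def pairs_to_dict(flat):
    """Turn a flat [name, value, name, value, ...] list into a dict."""
    return dict(zip(flat[0::2], flat[1::2]))

cat = pairs_to_dict(json.loads(raw)["facet_counts"]["facet_fields"]["cat"])
num_facet_terms = cat["numFacetTerms"]
counts = pairs_to_dict(cat["counts"])

print(num_facet_terms, len(counts))
```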

> Get distinct count of names for a facet field
> ---------------------------------------------
>
>                 Key: SOLR-2242
>                 URL: https://issues.apache.org/jira/browse/SOLR-2242
>             Project: Solr
>          Issue Type: New Feature
>          Components: Response Writers
>    Affects Versions: 4.0
>            Reporter: Bill Bell
>            Priority: Minor
>             Fix For: 4.0
>
>         Attachments: SOLR-2242-distinctFacet.patch
>
>
> When returning facet.field=<name of field> you will get a list of matches for distinct values. This is normal behavior. This patch tells you how many distinct values you have (# of rows). Use with limit=-1 and mincount=1.
> The feature is called "namedistinct". Here is an example:
> http://localhost:8983/solr/select?q=*:*&facet=true&facet.field=manu&facet.mincount=1&facet.limit=-1&f.manu.facet.namedistinct=0&facet.field=price&f.price.facet.namedistinct=1
> Here is an example on field "hgid" (without namedistinct):
> {code}
> - <lst name="facet_fields">
> - <lst name="hgid">
>   <int name="HGPY0000045FD36D4000A">1</int> 
>   <int name="HGPY00000FBC6690453A9">1</int> 
>   <int name="HGPY00001E44ED6C4FB3B">1</int> 
>   <int name="HGPY00001FA631034A1B8">1</int> 
>   <int name="HGPY00003317ABAC43B48">1</int> 
>   <int name="HGPY00003A17B2294CB5A">5</int> 
>   <int name="HGPY00003ADD2B3D48C39">1</int> 
>   </lst>
>   </lst>
> {code}
> With namedistinct (HGPY0000045FD36D4000A, HGPY00000FBC6690453A9, HGPY00001E44ED6C4FB3B, HGPY00001FA631034A1B8, HGPY00003317ABAC43B48, HGPY00003A17B2294CB5A, HGPY00003ADD2B3D48C39). This returns number of rows (7), not the number of values (11).
> {code}
> - <lst name="facet_fields">
> - <lst name="hgid">
>   <int name="_count_">7</int> 
>   </lst>
>   </lst>
> {code}
> This works actually really good to get total number of fields for a group.field=hgid. Enjoy!



[jira] [Commented] (SOLR-2242) Get distinct count of names for a facet field

Posted by "Bill Bell (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13044727#comment-13044727 ] 

Bill Bell commented on SOLR-2242:
---------------------------------

Since we changed the output of facet_fields, FacetComponent.java needs to change. This also impacts the DistribFieldFacet type. The current code is not going to work, since price no longer holds just a list of numbers; it now has multiple lists (if we set the param). We might want to always return a "counts" list in all cases, so that sharding can easily pick up on this. DistribFieldFacet needs to be refactored.

{code}
<lst name="facet_fields">
  <lst name="price">
    <int name="numFacetTerms">14</int>
    <lst name="counts"><int name="0.0">3</int><int name="11.5">1</int><int name="19.95">1</int><int name="74.99">1</int><int name="92.0">1</int><int name="179.99">1</int><int name="185.0">1</int><int name="279.95">1</int><int name="329.95">1</int><int name="350.0">1</int><int name="399.0">1</int><int name="479.95">1</int><int name="649.99">1</int><int name="2199.0">1</int>
    </lst>
  </lst>
</lst>
{code}
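
One thing to keep in mind for the sharded case, sketched here in Python rather than in the actual FacetComponent code: the distinct-term count has to be taken after the per-shard "counts" lists are merged, because the same term can appear on more than one shard. The shard data below is made up purely for illustration, and this is not how the patch is implemented.

```python
from collections import Counter

# Hypothetical per-shard "counts" for the price field (made-up values).
shard_a = {"0.0": 2, "11.5": 1, "19.95": 1}
shard_b = {"0.0": 1, "74.99": 1}

# Merge the shards by adding up the counts for each term.
merged = Counter()
for shard_counts in (shard_a, shard_b):
    merged.update(shard_counts)

# Distinct terms counted after merging; "0.0" lives on both shards.
num_facet_terms = len(merged)

# Adding up each shard's own distinct count would count "0.0" twice.
naive_sum = len(shard_a) + len(shard_b)

print(num_facet_terms, naive_sum, merged["0.0"])
```

This only holds if every shard returns its full term list; with a facet.limit on the shard requests the merged count could be too low.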




> Get distinct count of names for a facet field
> ---------------------------------------------
>
>                 Key: SOLR-2242
>                 URL: https://issues.apache.org/jira/browse/SOLR-2242
>             Project: Solr
>          Issue Type: New Feature
>          Components: Response Writers
>    Affects Versions: 4.0
>            Reporter: Bill Bell
>            Priority: Minor
>             Fix For: 4.0
>
>         Attachments: SOLR-2242.patch, SOLR-2242.solr3.1.patch, SOLR.2242.solr3.1.patch, SOLR.2242.v2.patch
>
>
> When returning facet.field=<name of field> you will get a list of matches for distinct values. This is normal behavior. This patch tells you how many distinct values you have (# of rows). Use with limit=-1 and mincount=1.
> The feature is called "namedistinct". Here is an example:
> http://localhost:8983/solr/select?q=*:*&facet=true&facet.field=manu&facet.mincount=1&facet.limit=-1&f.manu.facet.namedistinct=0&facet.field=price&f.price.facet.namedistinct=1
> Here is an example on field "hgid" (without namedistinct):
> {code}
> - <lst name="facet_fields">
> - <lst name="hgid">
>   <int name="HGPY0000045FD36D4000A">1</int> 
>   <int name="HGPY00000FBC6690453A9">1</int> 
>   <int name="HGPY00001E44ED6C4FB3B">1</int> 
>   <int name="HGPY00001FA631034A1B8">1</int> 
>   <int name="HGPY00003317ABAC43B48">1</int> 
>   <int name="HGPY00003A17B2294CB5A">5</int> 
>   <int name="HGPY00003ADD2B3D48C39">1</int> 
>   </lst>
>   </lst>
> {code}
> With namedistinct (HGPY0000045FD36D4000A, HGPY00000FBC6690453A9, HGPY00001E44ED6C4FB3B, HGPY00001FA631034A1B8, HGPY00003317ABAC43B48, HGPY00003A17B2294CB5A, HGPY00003ADD2B3D48C39). This returns number of rows (7), not the number of values (11).
> {code}
> - <lst name="facet_fields">
> - <lst name="hgid">
>   <int name="_count_">7</int> 
>   </lst>
>   </lst>
> {code}
> This works actually really good to get total number of fields for a group.field=hgid. Enjoy!
