Posted to solr-user@lucene.apache.org by William Bell <bi...@gmail.com> on 2014/02/03 16:48:56 UTC

Duplicate Facet.Fields cause same results, should dedupe?

If we add:

facet.field=prac_spec_heir&facet.field=prac_spec_heir

we get it twice in the results. This breaks deserialization with wt=json,
since you cannot have the same name twice.

Thoughts? Seems like a new bug in 4.6?


"facet.field": ["prac_spec_heir","all_proc_name_code","all_cond_name_code",
"prac_spec_heir","{!ex=exgender}gender","{!ex=expayor}payor_code_name"],

-- 
Bill Bell
billnbell@gmail.com
cell 720-256-8076

Re: Duplicate Facet.Fields cause same results, should dedupe?

Posted by Varun Thacker <va...@gmail.com>.
Hi William,

I doubt this is a bug.

I tried this on Solr 4.5.1, indexing documents with "java -jar post.jar *.xml".

This is the query I ran:
http://localhost:8983/solr/collection1/select?q=*:*&wt=json&indent=true&rows=0&facet=true&facet.limit=1&facet.field=name&facet.field=name

And here is the response I got: https://gist.github.com/vthacker/8800846

The JSON spec does not require names within an object to be unique (it
only says they SHOULD be). See section 2.2 of
http://www.ietf.org/rfc/rfc4627.txt

There is a similar issue when indexing documents in JSON with multiple
"add" keys.

Not sure if it will be helpful, but I found this Jackson deserialization
feature: "FAIL_ON_READING_DUP_TREE_KEY" (
https://github.com/FasterXML/jackson-databind/wiki/Deserialization-Features)
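
To make the parser behaviour concrete, here is a minimal Python sketch (not
Solr or Jackson code). It assumes a cut-down, hypothetical response body in
which "name" appears twice under facet_fields; the facet values and counts
are invented for illustration. Python's json.loads silently keeps the last
value for a repeated name, while the object_pairs_hook argument lets the
caller see every pair.

```python
import json

# Hypothetical, cut-down response body: "name" appears twice under
# facet_fields, as it would when facet.field=name is sent twice.
# The facet values and counts are made up for illustration.
RAW = '{"facet_fields": {"name": ["ipod", 3], "name": ["ipod", 3]}}'


def keep_duplicates(pairs):
    """Map every name to a list of its values, so repeats are not dropped."""
    result = {}
    for name, value in pairs:
        result.setdefault(name, []).append(value)
    return result


# Default behaviour: no error, the last value for a repeated name wins.
default_parsed = json.loads(RAW)

# With object_pairs_hook, every (name, value) pair reaches the caller.
# Note that with this hook each name maps to a list, even when not repeated.
all_parsed = json.loads(RAW, object_pairs_hook=keep_duplicates)
```

A hook that raises an exception on a repeated name would give the strict
behaviour instead; which of the two is appropriate depends on the client.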


On Tue, Feb 4, 2014 at 10:03 AM, William Bell <bi...@gmail.com> wrote:

> This is in 4.6.1.



-- 


Regards,
Varun Thacker
http://www.vthacker.in/

Re: Duplicate Facet.Fields cause same results, should dedupe?

Posted by William Bell <bi...@gmail.com>.
This is in 4.6.1.


On Mon, Feb 3, 2014 at 9:11 PM, Otis Gospodnetic <otis.gospodnetic@gmail.com> wrote:

> Hi,
>
> Don't know if this is old or new problem, but it does feel like a bug to
> me.
>
> Otis
> --
> Performance Monitoring * Log Analytics * Search Analytics
> Solr & Elasticsearch Support * http://sematext.com/



-- 
Bill Bell
billnbell@gmail.com
cell 720-256-8076

Re: Duplicate Facet.Fields cause same results, should dedupe?

Posted by Otis Gospodnetic <ot...@gmail.com>.
Hi,

Don't know if this is an old or new problem, but it does feel like a bug to me.

Otis
--
Performance Monitoring * Log Analytics * Search Analytics
Solr & Elasticsearch Support * http://sematext.com/



Re: Duplicate Facet.Fields cause same results, should dedupe?

Posted by Chris Hostetter <ho...@fucit.org>.
: facet.field=prac_spec_heir&facet.field=prac_spec_heir
	...
: Thoughts? Seems like a new bug in 4.6 ?

Nope ... it's always been like that. We could conceivably dedup, but
that seems like unnecessary cycles in most cases -- if the client asks
for redundant faceting, the client gets redundant faceting.
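
If the repeats are accidental, the simplest place to drop them is in the
client before the request is sent. The following is a minimal Python sketch
of that idea, not anything Solr provides: dedupe_params is a hypothetical
helper, the field names come from the original report, and the remaining
parameters are illustrative. It removes only exact repeats of a (name, value)
pair, so two facet.field values that differ in their local params are both
kept.

```python
from urllib.parse import urlencode


def dedupe_params(params):
    """Drop exact repeats of (name, value) pairs, keeping first-seen order."""
    # dict.fromkeys preserves insertion order and discards later duplicates.
    return list(dict.fromkeys(params))


# Field names are from the original report; q, wt and facet are illustrative.
params = [
    ("q", "*:*"),
    ("wt", "json"),
    ("facet", "true"),
    ("facet.field", "prac_spec_heir"),
    ("facet.field", "all_proc_name_code"),
    ("facet.field", "prac_spec_heir"),  # accidental repeat
]

# urlencode accepts a sequence of pairs, so repeated names are allowed.
query_string = urlencode(dedupe_params(params))
```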

Even if someone proposed a patch to dedup as a way to "help" out wayward
users who specify redundant faceting by mistake, we'd have to think
reeaaaaaally carefully about how we go about it, since there are several
use cases where people can explicitly ask for "duplicate" faceting
(depending on your definition of "duplicate") to get different things...

* same raw field name, but different options and different output keys...
facet.field={!key=bar ex=ex1}xxx&facet.field={!key=foo ex=ex2}xxx
...basic dedup on the underlying field names would break this

* different raw field names, but same output key so the client can lump them
together...
facet.field={!key=foo ex=ex1}xxx&facet.field={!key=foo ex=ex2}yyy
...deduping on the output key would break this
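
As a rough illustration of the first case, here is how a client might build
that request in Python. This is only a sketch: xxx is the placeholder field
name used above, and the ex1/ex2 tags are assumed to correspond to tagged
filters defined elsewhere in the request. The local-params syntax contains
braces, spaces and "=", so the values need URL-encoding, and the round trip
through parse_qsl shows that both facet.field values survive as distinct
parameters.

```python
from urllib.parse import parse_qsl, urlencode

# Same underlying field (the placeholder "xxx"), but distinct output keys
# and exclusion tags, so the two parameters are not redundant.
params = [
    ("facet", "true"),
    ("facet.field", "{!key=bar ex=ex1}xxx"),
    ("facet.field", "{!key=foo ex=ex2}xxx"),
]

# urlencode percent-encodes the braces and "=" and turns spaces into "+".
query_string = urlencode(params)

# Decoding the query string recovers both facet.field values unchanged.
round_trip = parse_qsl(query_string)
```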


: we get it twice in the results. This breaks deserialization on wt=json
: since you cannot have the same name twice....

As mentioned in another reply, this is completely valid and legal JSON --
it's just that some JSON parsers are broken (or have broken default
behavior).


-Hoss
http://www.lucidworks.com/