Posted to solr-user@lucene.apache.org by jisenhart <ji...@yoholla.com> on 2011/04/07 20:18:20 UTC

Queries with undetermined field count

I have a question on how to set up queries not having a predetermined 
field list to search on.

Here are some sample docs,
<doc>
    <str name="_id">1234</str>
    <str name="_fred"><str>hi</str><str>hello</str></str>
    <str name="_group3"><str>lala</str><str>chika chika boom boom</str></str>
</doc>
<doc>
    <str name="_id">1235</str>
    <str name="_group1"><str>foo</str><str>bar</str><str>happy happy joy joy</str></str>
    <str name="_group2"><str>some text</str><str>some more words to search</str></str>
</doc>
.
.
.
<doc>
    <str name="_id">4567</str>
    <str name="_wilma"><str>bed</str><str>rock</str></str>
    <str name="_group3"><str>meme</str><str>you you</str></str>
    <str name="_group52"><str>super duper</str><str>are we done?</str></str>
</doc>

Now a given user, say fred, belongs to any number of groups; for this 
example, say only fred and group1.
A query on 'foo' is easy if I know that fred belongs to only these 
two:

	_fred:foo OR _group1:foo //will find a hit on doc 1235

However, a user can belong to any number of groups. How do I perform 
such a search if the user's group list is arbitrarily large?

Could I somehow make use of reference docs like so:

<doc>
    <str name="_id">fred</str>
    <str name="_groups"><str>fred</str><str>group1</str></str>
</doc>
.
.
.
<doc>
    <str name="_id">wilma</str>
    <str name="_groups"><str>wilma</str><str>group1</str><str>group5</str><str>group9</str><str>group11</str><str>group31</str><str>group40</str></str>
</doc>


Re: Queries with undetermined field count

Posted by Renaud Delbru <re...@deri.org>.
Hi,

SIREn [1], a Lucene/Solr plugin, allows you to perform queries across an 
undetermined number of fields, even if you have hundreds of thousands of 
fields. It might be helpful for your scenario.

[1] http://siren.sindice.com
-- 
Renaud Delbru



Re: Queries with undetermined field count

Posted by Erick Erickson <er...@gmail.com>.
One possibility is to have just a single multiValued "groups" field with a
positionIncrementGap of, say, 100.

Now, index values like

"group1 foo bar happy joy joy"
"group2 some more words to search"
etc.

Now do phrase queries with a slop of less than 100. Then searches like
groups:"group1 more"~99 would not match because the gap is greater
than the slop.
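
To make the arithmetic concrete, here is a rough Python sketch of how the
positions work out. It is a simplified model, not Lucene's actual matching
code: it assumes whitespace tokenization, one position per token, and only
checks the two terms in the order given. The function names are made up for
illustration.

```python
def token_positions(values, gap):
    """Assign a position to every whitespace token of a multi-valued field.

    Simplified model (an assumption, not Lucene's code): each token advances
    the position by 1, and `gap` extra positions are skipped between
    consecutive values, mimicking positionIncrementGap.
    """
    positions = {}
    pos = -1
    for i, value in enumerate(values):
        if i > 0:
            pos += gap  # the increment gap between two values
        for token in value.split():
            pos += 1
            positions.setdefault(token, []).append(pos)
    return positions


def sloppy_match(positions, first, second, slop):
    """True if `second` follows `first` with at most `slop` positions between."""
    return any(
        0 <= p2 - p1 - 1 <= slop
        for p1 in positions.get(first, [])
        for p2 in positions.get(second, [])
    )


values = ["group1 foo bar happy joy joy", "group2 some more words to search"]
positions = token_positions(values, gap=100)

print(sloppy_match(positions, "group1", "joy", 99))   # same value: True
print(sloppy_match(positions, "group1", "more", 99))  # crosses the gap: False
```

In this model "group1" sits at position 0 and "more" at 108, so 107
positions lie between them and a slop of 99 cannot bridge the gap.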

Of course this works better if the values are single tokens and you can
index values like
group1 foo
group1 bar
group1 happy
group1 joy
group2 some

with the same increment trick. In that case, the slop could just be, say, 2
and the increment gap 10 or some such.
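
On the query side, something like the following Python sketch could build
the phrase clauses for a user's groups. The helper name, the "groups" field
name and the default slop of 2 are just assumptions taken from the example
above, and it assumes group names and search terms are plain tokens that
need no escaping.

```python
def build_group_query(groups, term, field="groups", slop=2):
    """Build one sloppy phrase clause per group, OR'ed together.

    Hypothetical helper: pairs each group name with the search term so a
    clause only matches values that were indexed as "<group> <term>".
    """
    clauses = [f'{field}:"{group} {term}"~{slop}' for group in groups]
    return " OR ".join(clauses)


print(build_group_query(["fred", "group1"], "foo"))
# groups:"fred foo"~2 OR groups:"group1 foo"~2
```

Note that this still emits one clause per group; it only replaces the
per-group field names with a single field.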

Best
Erick
