Posted to solr-user@lucene.apache.org by wojtekpia <wo...@hotmail.com> on 2009/08/26 21:58:02 UTC

Searching and Displaying Different Logical Entities

I'm trying to figure out if Solr is the right solution for a problem I'm
facing. I have 2 data entities: P(arent) & C(hild). P contains up to 100
instances of C. I need to expose an interface that searches attributes of
entity C, but displays them grouped by parent entity, P. I need to include
facet counts in the result, and the counts are based on P.

My first solution was to create 2 Solr instances: one for each entity. I
would have to execute 2 queries each time: 1) get a list of matching P's
based on a query of the C instance (facet by P ID in C instance to get
unique list of P's), then 2) get all P's by ID, including facet counts, etc.
The problem I face with this solution is that I can have many matching P's
(10,000+), so my second query will have many (10,000+) constraints. 
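As a rough sketch of the two-query approach described above, the request parameters could be built as follows. The field names `p_id`, `id` and `category` and the sample query `CA:red` are assumptions made for illustration; only the general `facet`, `facet.field`, `facet.mincount` and `facet.limit` parameters are standard Solr.

```python
from urllib.parse import urlencode


def child_facet_params(child_query):
    """Query 1: search the C index and facet on the parent ID field."""
    return {
        "q": child_query,
        "rows": 0,              # only the facet counts are needed here
        "facet": "true",
        "facet.field": "p_id",  # hypothetical field holding the parent ID
        "facet.mincount": 1,    # skip parents with no matching child
        "facet.limit": -1,      # return every matching parent
    }


def parent_lookup_params(parent_ids):
    """Query 2: fetch the matching parents from the P index by ID."""
    return {
        "q": "id:(" + " OR ".join(parent_ids) + ")",
        "facet": "true",
        "facet.field": "category",  # hypothetical facet field on P
    }


# 10,000 matching parents, as in the scenario described above.
parent_ids = [f"P{i}" for i in range(10_000)]
first = urlencode(child_facet_params("CA:red"))
second = urlencode(parent_lookup_params(parent_ids))
print(len(parent_ids), "clauses,", len(second), "characters in query 2")
```

The second request grows linearly with the number of matching parents, which is the constraint problem mentioned above.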

My second (and current) solution is to create a single instance, and flatten
all C attributes into the appropriate P record using dynamic fields. For
example, if C has an attribute CA, then I have a dynamic field in P called
CA*. I name this field incrementally based on the number of C's per P (CA1,
CA2, ...).  This works, except that each query is very long (CA1:condition
OR CA2: condition ...). 
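To make the size of such a query concrete, here is a minimal sketch that generates it. The helper name `flattened_child_query` and the sample condition `red` are hypothetical; the limit of 100 children per parent comes from the description above.

```python
def flattened_child_query(attribute, condition, max_children=100):
    """OR the same condition across the attribute1..attributeN dynamic fields."""
    clauses = [
        f"{attribute}{i}:{condition}" for i in range(1, max_children + 1)
    ]
    return " OR ".join(clauses)


query = flattened_child_query("CA", "red")
print(query[:40] + " ...")
print(query.count(" OR ") + 1, "clauses")
```

Every additional C attribute in a search adds up to another 100 clauses, so the query length scales with both the number of attributes searched and the maximum number of C's per P.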

Neither solution is ideal. I'm wondering if I'm missing something obvious,
or if I'm using the wrong solution for this problem.

Any insight is appreciated.

Wojtek
-- 
View this message in context: http://www.nabble.com/Searching-and-Displaying-Different-Logical-Entities-tp25156301p25156301.html
Sent from the Solr - User mailing list archive at Nabble.com.


RE: Searching and Displaying Different Logical Entities

Posted by wojtekpia <wo...@hotmail.com>.

Funtick wrote:
> 
> > then 2) get all P's by ID, including facet counts, etc.
> > The problem I face with this solution is that I can have many matching
> > P's (10,000+), so my second query will have many (10,000+) constraints.
> 
> SOLR can automatically provide you P's with Counts, and it will be
> _unique_...
> 
> 

I assume you mean to facet by P in the C index. My next problem is to sort
those P's based on some attribute of P (as opposed to alphabetically or by
occurrence in C).


Funtick wrote:
> 
> Even if the cardinality of P is 10,000+, SOLR is very fast now (expect a
> response time of a few seconds for the initial request). You need a single
> query with "faceting"...
> 

Is there a practical limit for maxBooleanClauses? The default is 1024, but I
need at least 10,000.
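One possible way to stay under the clause limit without raising it, sketched here purely as an assumption and not something proposed in this thread, is to split the ID list into batches. The helper name `batched_id_queries` and the `id` field are hypothetical; 1024 is the default quoted above.

```python
def batched_id_queries(parent_ids, max_clauses=1024):
    """Yield id:(...) queries that each contain at most max_clauses terms."""
    for start in range(0, len(parent_ids), max_clauses):
        batch = parent_ids[start:start + max_clauses]
        yield "id:(" + " OR ".join(batch) + ")"


parent_ids = [f"P{i}" for i in range(10_000)]
queries = list(batched_id_queries(parent_ids))
print(len(queries), "queries")
```

The trade-off is that facet counts and sort order would then have to be merged on the client across the batches, which may defeat the purpose for this use case.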


Funtick wrote:
> 
> (!) You do not need P's ID.
> 
> A single document will have a unique ID, and fields such as P, C (with
> possible attributes). Do not think in terms of RDBMS... Lucene does all
> 'normalization' behind the scenes, and SOLR will give you Ps with Cs...
> 

If I put both P's and C's into a single index, then I agree, I don't need
P's ID. If I have P and C in separate indices then I still need to maintain
the logical relationship between P and C. 

It wasn't clear to me if you suggested I continue with either of my 2
proposed solutions. Can you clarify?

Thanks,

Wojtek
-- 
View this message in context: http://www.nabble.com/Searching-and-Displaying-Different-Logical-Entities-tp25156301p25181664.html
Sent from the Solr - User mailing list archive at Nabble.com.


RE: Searching and Displaying Different Logical Entities

Posted by Fuad Efendi <fu...@efendi.ca>.

> then 2) get all P's by ID, including facet counts, etc.
> The problem I face with this solution is that I can have many matching P's
> (10,000+), so my second query will have many (10,000+) constraints.


SOLR can automatically provide you P's with Counts, and it will be
_unique_...

Even if the cardinality of P is 10,000+, SOLR is very fast now (expect a
response time of a few seconds for the initial request). You need a single
query with "faceting"...


(!) You do not need P's ID.

A single document will have a unique ID, and fields such as P, C (with
possible attributes). Do not think in terms of RDBMS... Lucene does all
'normalization' behind the scenes, and SOLR will give you Ps with Cs...
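The denormalized layout suggested here can be illustrated with a small in-memory sketch. It only mimics what a query with faceting on a P field would return; the sample documents, the `CA` values and the helper `facet_by_parent` are invented for illustration, and no Solr call is made.

```python
from collections import Counter

# One document per child, each carrying its parent as a plain field.
docs = [
    {"id": "1", "P": "P1", "CA": "red"},
    {"id": "2", "P": "P1", "CA": "blue"},
    {"id": "3", "P": "P2", "CA": "red"},
    {"id": "4", "P": "P3", "CA": "green"},
    {"id": "5", "P": "P1", "CA": "red"},
]


def facet_by_parent(documents, field, value):
    """Mimic q=field:value with facet.field=P: matching docs per parent."""
    return Counter(d["P"] for d in documents if d[field] == value)


counts = facet_by_parent(docs, "CA", "red")
print(sorted(counts.items()))  # [('P1', 2), ('P2', 1)]
```

Each parent appears once in the facet result, so the list of matching P's is unique. Note that the counts are numbers of matching C's per P, which is not necessarily the same as facet counts based on P.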


