You are viewing a plain text version of this content. The canonical link for it is here.
Posted to solr-user@lucene.apache.org by "David T. Webb" <da...@brightmove.com> on 2011/11/08 23:19:45 UTC

Search Correlated Data between Multivalued Fields

I have a normalized database schema that I have flattened out to create
a Solr schema.  My question is with regards to searching the multivalued
fields that are correlated from the sub-entity in the DataInputHandler.

 

Example

I have 2 tables CUSTOMER and NOTE

 

Customer can have one to many notes.

 

My data-config would look similar to this: (Not exact, just setting up
the question) J

 

<entity name="customer" query="select id, name, address, city from
customer">

  <entity name="note" query="Select note_id, note_date, note_text from
note where customer_id = ${customer.ID}"/>

</entity>

 

My schema would be something like this:

 

<field name="id" type="string" indexed="true" stored="true"
required="true" />

<field name="name" type="string" indexed="true" stored="true"
required="false" />

<field name="address" type="string" indexed="true" stored="true"
required="false" />

<field name="city" type="string" indexed="true" stored="true"
required="false" />

 

<field name="note_Id" type="long" indexed="false" stored="true"
required="false" multiValued="true" /> 

<field name="note_date" type="date" indexed="true" stored="true"
required="false" multiValued="true" />

<field name="note_text" type="text_general" indexed="true" stored="true"
required="false" multiValued="true" />

 

All is well, indexed and searchable. 

 

So, if there are 100 notes per customer at varying dates, how would I
query to essentially ask:

 

Give me all the Customers where note_text has "sales" AND the note_date
is between Date1 and Date2?

 

The multi-valued data is stored as arrays and the array positions line
up property. (i.e.  note_id[x], note_date[x], and note_Text[x] represent
an actual row that was loaded from the database.

 

Any suggestions on how to accomplish my problem?

 

Thank you!

 

--

Sincerely,

David Webb

 


Re: Search Correlated Data between Multivalued Fields

Posted by Andre Bois-Crettez <an...@kelkoo.com>.
Something like :

<entity name="customer_note" query="select id, name, address, city, note_id, note_date, note_text
  from customer,note where customer.id=customer_id"/>



David T. Webb wrote:
> Can you point me to the docs on how to create the additional flat index of note?  Thx for the quick reply. Dave. 
>
> Sent from my iPhone
>
> On Nov 9, 2011, at 6:03 AM, "Andre Bois-Crettez" <an...@kelkoo.com> wrote:
>
>   
>> I do not think this is possbile directly out of the box in Solr.
>>
>> A quick workaround would be to fully denormalize the data, ie instead of multivalued notes for a customer, have a completely flat index of customer_note.
>> Or maybe a custom request handler plugin could actually check that matches are for note_id[x], note_date[x], and note_Text[x] ? Not sure if this is doable.
>>
>> Andre
>>
>> David T. Webb wrote:
>>     
>>> I have a normalized database schema that I have flattened out to create
>>> a Solr schema.  My question is with regards to searching the multivalued
>>> fields that are correlated from the sub-entity in the DataInputHandler.
>>>
>>>
>>> Example
>>>
>>> I have 2 tables CUSTOMER and NOTE
>>>
>>>
>>> Customer can have one to many notes.
>>>
>>>
>>> My data-config would look similar to this: (Not exact, just setting up
>>> the question) J
>>>
>>>
>>> <entity name="customer" query="select id, name, address, city from
>>> customer">
>>>
>>>  <entity name="note" query="Select note_id, note_date, note_text from
>>> note where customer_id = ${customer.ID}"/>
>>>
>>> </entity>
>>>
>>>
>>> My schema would be something like this:
>>>
>>>
>>> <field name="id" type="string" indexed="true" stored="true"
>>> required="true" />
>>>
>>> <field name="name" type="string" indexed="true" stored="true"
>>> required="false" />
>>>
>>> <field name="address" type="string" indexed="true" stored="true"
>>> required="false" />
>>>
>>> <field name="city" type="string" indexed="true" stored="true"
>>> required="false" />
>>>
>>>
>>> <field name="note_Id" type="long" indexed="false" stored="true"
>>> required="false" multiValued="true" /> 
>>> <field name="note_date" type="date" indexed="true" stored="true"
>>> required="false" multiValued="true" />
>>>
>>> <field name="note_text" type="text_general" indexed="true" stored="true"
>>> required="false" multiValued="true" />
>>>
>>>
>>> All is well, indexed and searchable. 
>>>
>>> So, if there are 100 notes per customer at varying dates, how would I
>>> query to essentially ask:
>>>
>>>
>>> Give me all the Customers where note_text has "sales" AND the note_date
>>> is between Date1 and Date2?
>>>
>>>
>>> The multi-valued data is stored as arrays and the array positions line
>>> up property. (i.e.  note_id[x], note_date[x], and note_Text[x] represent
>>> an actual row that was loaded from the database.
>>>
>>>
>>> Any suggestions on how to accomplish my problem?
>>>
>>>
>>> Thank you!
>>>
>>>
>>> --
>>>
>>> Sincerely,
>>>
>>> David Webb
>>>
>>>
>>>
>>>  
>>>       
>> -- 
>> André Bois-Crettez
>>
>> Search technology, Kelkoo
>> http://www.kelkoo.com/
>>
>>     

-- 
André Bois-Crettez

Search technology, Kelkoo
http://www.kelkoo.com/


Re: Search Correlated Data between Multivalued Fields

Posted by "David T. Webb" <da...@brightmove.com>.
Can you point me to the docs on how to create the additional flat index of note?  Thx for the quick reply. Dave. 

Sent from my iPhone

On Nov 9, 2011, at 6:03 AM, "Andre Bois-Crettez" <an...@kelkoo.com> wrote:

> I do not think this is possbile directly out of the box in Solr.
> 
> A quick workaround would be to fully denormalize the data, ie instead of multivalued notes for a customer, have a completely flat index of customer_note.
> Or maybe a custom request handler plugin could actually check that matches are for note_id[x], note_date[x], and note_Text[x] ? Not sure if this is doable.
> 
> Andre
> 
> David T. Webb wrote:
>> I have a normalized database schema that I have flattened out to create
>> a Solr schema.  My question is with regards to searching the multivalued
>> fields that are correlated from the sub-entity in the DataInputHandler.
>> 
>> 
>> Example
>> 
>> I have 2 tables CUSTOMER and NOTE
>> 
>> 
>> Customer can have one to many notes.
>> 
>> 
>> My data-config would look similar to this: (Not exact, just setting up
>> the question) J
>> 
>> 
>> <entity name="customer" query="select id, name, address, city from
>> customer">
>> 
>>  <entity name="note" query="Select note_id, note_date, note_text from
>> note where customer_id = ${customer.ID}"/>
>> 
>> </entity>
>> 
>> 
>> My schema would be something like this:
>> 
>> 
>> <field name="id" type="string" indexed="true" stored="true"
>> required="true" />
>> 
>> <field name="name" type="string" indexed="true" stored="true"
>> required="false" />
>> 
>> <field name="address" type="string" indexed="true" stored="true"
>> required="false" />
>> 
>> <field name="city" type="string" indexed="true" stored="true"
>> required="false" />
>> 
>> 
>> <field name="note_Id" type="long" indexed="false" stored="true"
>> required="false" multiValued="true" /> 
>> <field name="note_date" type="date" indexed="true" stored="true"
>> required="false" multiValued="true" />
>> 
>> <field name="note_text" type="text_general" indexed="true" stored="true"
>> required="false" multiValued="true" />
>> 
>> 
>> All is well, indexed and searchable. 
>> 
>> So, if there are 100 notes per customer at varying dates, how would I
>> query to essentially ask:
>> 
>> 
>> Give me all the Customers where note_text has "sales" AND the note_date
>> is between Date1 and Date2?
>> 
>> 
>> The multi-valued data is stored as arrays and the array positions line
>> up property. (i.e.  note_id[x], note_date[x], and note_Text[x] represent
>> an actual row that was loaded from the database.
>> 
>> 
>> Any suggestions on how to accomplish my problem?
>> 
>> 
>> Thank you!
>> 
>> 
>> --
>> 
>> Sincerely,
>> 
>> David Webb
>> 
>> 
>> 
>>  
> 
> -- 
> André Bois-Crettez
> 
> Search technology, Kelkoo
> http://www.kelkoo.com/
> 

Re: Search Correlated Data between Multivalued Fields

Posted by Andre Bois-Crettez <an...@kelkoo.com>.
I do not think this is possbile directly out of the box in Solr.

A quick workaround would be to fully denormalize the data, ie instead of 
multivalued notes for a customer, have a completely flat index of 
customer_note.
Or maybe a custom request handler plugin could actually check that 
matches are for note_id[x], note_date[x], and note_Text[x] ? Not sure if 
this is doable.

Andre

David T. Webb wrote:
> I have a normalized database schema that I have flattened out to create
> a Solr schema.  My question is with regards to searching the multivalued
> fields that are correlated from the sub-entity in the DataInputHandler.
>
>  
>
> Example
>
> I have 2 tables CUSTOMER and NOTE
>
>  
>
> Customer can have one to many notes.
>
>  
>
> My data-config would look similar to this: (Not exact, just setting up
> the question) J
>
>  
>
> <entity name="customer" query="select id, name, address, city from
> customer">
>
>   <entity name="note" query="Select note_id, note_date, note_text from
> note where customer_id = ${customer.ID}"/>
>
> </entity>
>
>  
>
> My schema would be something like this:
>
>  
>
> <field name="id" type="string" indexed="true" stored="true"
> required="true" />
>
> <field name="name" type="string" indexed="true" stored="true"
> required="false" />
>
> <field name="address" type="string" indexed="true" stored="true"
> required="false" />
>
> <field name="city" type="string" indexed="true" stored="true"
> required="false" />
>
>  
>
> <field name="note_Id" type="long" indexed="false" stored="true"
> required="false" multiValued="true" /> 
>
> <field name="note_date" type="date" indexed="true" stored="true"
> required="false" multiValued="true" />
>
> <field name="note_text" type="text_general" indexed="true" stored="true"
> required="false" multiValued="true" />
>
>  
>
> All is well, indexed and searchable. 
>
>  
>
> So, if there are 100 notes per customer at varying dates, how would I
> query to essentially ask:
>
>  
>
> Give me all the Customers where note_text has "sales" AND the note_date
> is between Date1 and Date2?
>
>  
>
> The multi-valued data is stored as arrays and the array positions line
> up property. (i.e.  note_id[x], note_date[x], and note_Text[x] represent
> an actual row that was loaded from the database.
>
>  
>
> Any suggestions on how to accomplish my problem?
>
>  
>
> Thank you!
>
>  
>
> --
>
> Sincerely,
>
> David Webb
>
>  
>
>
>   

-- 
André Bois-Crettez

Search technology, Kelkoo
http://www.kelkoo.com/