Posted to solr-user@lucene.apache.org by Derek Poh <dp...@globalsources.com> on 2017/04/27 05:25:17 UTC

1 main collection or multiple smaller collections?

Hi
I am planning a migration of a legacy search engine to Solr.
Basically the data can be categorised into suppliers info, suppliers 
products info and products category info. These sets of data are related 
to each other.
The suppliers products data, which is the largest, has around 300,000 
records currently and is projected to increase.

Should I put these data in 1 single collection or in separate 
collections, e.g. 1 collection for suppliers info, 1 collection for 
suppliers products info and 1 collection for products categories info?
What should I consider and plan for when deciding which option to take?

Derek

----------------------
CONFIDENTIALITY NOTICE 

This e-mail (including any attachments) may contain confidential and/or privileged information. If you are not the intended recipient or have received this e-mail in error, please inform the sender immediately and delete this e-mail (including any attachments) from your computer, and you must not use, disclose to anyone else or copy this e-mail (including any attachments), whether in whole or in part. 

This e-mail and any reply to it may be monitored for security, legal, regulatory compliance and/or other appropriate reasons.

Re: 1 main collection or multiple smaller collections?

Posted by Walter Underwood <wu...@wunderwood.org>.
Also, 300,000 documents is fairly small for Solr. We handle a million queries per day with a few servers on a collection that size.

wunder
Walter Underwood
wunder@wunderwood.org
http://observer.wunderwood.org/  (my blog)




Re: 1 main collection or multiple smaller collections?

Posted by Rick Leir <rl...@leirtech.com>.
Derek
You could have one document per supplier which has no product info. It would have a flag to indicate this. Then your supplier search is simple. 

But grouping would be better, so the supplier search can show product counts and categories and ...

+1 Walter on designing back from the results page. That is from the NoSQL playbook.
Cheers -- Rick


-- 
Sorry for being brief. Alternate email is rickleir at yahoo dot com 

Re: 1 main collection or multiple smaller collections?

Posted by Derek Poh <dp...@globalsources.com>.
Walter

Thank you for sharing your use case. I will try to design backwards from 
the search result pages.
As of now a user can do either a supplier search or a product search.
Using 1 single collection of product documents, with supplier info in 
each product document, I will need to use result grouping or the 
collapse query parser for supplier search.
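
For illustration, the two options mentioned above might translate into request parameters roughly as follows. This is only a sketch: the field name supplier_id and the helper function names are assumptions, not something from the thread. The parameters themselves (group, group.field, group.limit, group.ngroups, the {!collapse} filter and expand) are standard Solr request parameters.

```python
from urllib.parse import urlencode


def supplier_search_grouped(text):
    """Result grouping: one group per supplier, a few products in each."""
    return {
        "q": text,
        "group": "true",
        "group.field": "supplier_id",  # hypothetical field name
        "group.limit": "3",            # products returned per supplier
        "group.ngroups": "true",       # also report the number of suppliers
    }


def supplier_search_collapsed(text):
    """Collapse query parser: keep one product document per supplier."""
    return {
        "q": text,
        "fq": "{!collapse field=supplier_id}",  # hypothetical field name
        "expand": "true",  # return the collapsed products in a separate section
    }


if __name__ == "__main__":
    print(urlencode(supplier_search_grouped("led lamp")))
    print(urlencode(supplier_search_collapsed("led lamp")))
```

Either dictionary can be URL-encoded and appended to a collection's /select handler. Which one performs better depends on the data, so both are worth testing.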




Re: 1 main collection or multiple smaller collections?

Posted by Walter Underwood <wu...@wunderwood.org>.
Design backwards from the search result pages (SRP). Make flat schema(s) with the fields you will search and display.

One example is the schema I used at Netflix. I used one collection to hold movies, people (actors), and genres. There were collisions between the integer IDs, so movie IDs were prefixed with “m”, people with “p”, and genres with “g”. The searched fields were “title” and “description”. There was also a “type” field which was “movie”, “person”, or “genre”. There was also a field for the database ID (without the prefix).

A movie SRP used an “fq” filter of “type:movie”, and so on for other SRPs. There were a few other filters, like G-rated movies or streaming, DVD, HD DVD, or Blu-ray.
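
A minimal sketch of the scheme described above, for readers who want to see it concretely. The field names "title", "description" and "type" come from the description; the names "id" and "db_id" and both function names are assumptions made for this example.

```python
# Prefix per document type, as described above.
PREFIX = {"movie": "m", "person": "p", "genre": "g"}


def to_solr_doc(doc_type, db_id, title, description):
    """Build one flat document whose key is unique across all types."""
    return {
        "id": f"{PREFIX[doc_type]}{db_id}",  # e.g. "m42"; "id" is an assumed name
        "type": doc_type,
        "db_id": db_id,                      # original database ID, no prefix
        "title": title,
        "description": description,
    }


def srp_params(doc_type, text):
    """Query parameters for one search result page, filtered by type."""
    return {"q": text, "fq": f"type:{doc_type}"}


if __name__ == "__main__":
    print(to_solr_doc("movie", 42, "Alien", "A crew meets a hostile creature."))
    print(srp_params("movie", "alien"))
```

A movie and a person that share the database ID 42 end up with the keys "m42" and "p42", so neither overwrites the other in the index.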

The full index was under 350K documents.

wunder
Walter Underwood
wunder@wunderwood.org
http://observer.wunderwood.org/  (my blog)




Re: 1 main collection or multiple smaller collections?

Posted by Derek Poh <dp...@globalsources.com>.
Richard

I am considering the same option as your suggestion: put them in 1 single 
collection of product documents, with each product doc containing the supplier info.
In this option, the supplier info will get repeated in each of the 
supplier's product docs. I may be influenced by DB concepts. I guess it's a 
trade-off for this option.




Re: 1 main collection or multiple smaller collections?

Posted by Rick Leir <rl...@leirtech.com>.
Does it make sense to use nested documents here? Products could be nested in a supplier document perhaps.

Alternately, consider de-normalizing "til it hurts". A product doc might be able to contain supplier info.
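
As a rough sketch of the nested option: Solr's JSON update format accepts child documents under the _childDocuments_ key, and the block join parent query parser can then return suppliers whose products match. The field names, the ID prefixes and the function name below are assumptions for illustration only.

```python
import json


def nested_supplier_doc(supplier, products):
    """Build one supplier document with its products nested as children."""
    return {
        "id": f"s{supplier['id']}",  # prefixes keep keys unique across types
        "type": "supplier",
        "supplier_name": supplier["name"],
        "_childDocuments_": [
            {
                "id": f"p{p['id']}",
                "type": "product",
                "product_description": p["description"],
            }
            for p in products
        ],
    }


# Block join: return supplier (parent) documents that have a matching product.
SUPPLIERS_WITH_MATCHING_PRODUCT = (
    '{!parent which="type:supplier"}product_description:lamp'
)

if __name__ == "__main__":
    doc = nested_supplier_doc(
        {"id": 1, "name": "XXX"},
        [{"id": 1, "description": "desk lamp"}],
    )
    print(json.dumps([doc], indent=2))
```

One thing to weigh: a parent and its children are indexed together as a block, so changing one product generally means re-sending the whole supplier block.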


-- 
Sorry for being brief. Alternate email is rickleir at yahoo dot com 

Re: 1 main collection or multiple smaller collections?

Posted by Derek Poh <dp...@globalsources.com>.
Hi Shawn

1 set of data is suppliers info and 1 set is the suppliers products info.
A user can do either a product search or a supplier search.

1 option I am thinking of is to put them in 1 single collection with each 
product as a document. Each product document will have the supplier info 
in it.
Product id will be the uniqueKey field.
With this option, the same supplier info will be in every product document 
of the supplier.

A simplified example:
doc:
product id: P1
product description: XXX
supplier id: S1
supplier name: XXX
supplier address: XXX

doc:
product id: P2
product description: XXXYYY
supplier id: S1
supplier name: XXX
supplier address: XXX
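
The layout above could be produced from the two source tables along these lines. This is a sketch only: the input shapes (a list of product rows and a dictionary of suppliers keyed by supplier id) and the underscore field names are assumptions.

```python
def denormalize(products, suppliers):
    """Copy each supplier's info into every one of its product documents."""
    docs = []
    for product in products:
        supplier = suppliers[product["supplier_id"]]
        docs.append(
            {
                "product_id": product["product_id"],  # the uniqueKey field
                "product_description": product["description"],
                "supplier_id": product["supplier_id"],
                "supplier_name": supplier["name"],
                "supplier_address": supplier["address"],
            }
        )
    return docs


if __name__ == "__main__":
    suppliers = {"S1": {"name": "XXX", "address": "XXX"}}
    products = [
        {"product_id": "P1", "description": "XXX", "supplier_id": "S1"},
        {"product_id": "P2", "description": "XXXYYY", "supplier_id": "S1"},
    ]
    for doc in denormalize(products, suppliers):
        print(doc)
```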

I may be influenced by DB concepts. Is such a design logical?





Re: 1 main collection or multiple smaller collections?

Posted by Shawn Heisey <ap...@elyograg.org>.
On 4/26/2017 11:57 PM, Derek Poh wrote:
> There are some common fields between them.
> At the source data end (database), the supplier info and product info
> are updated separately. In this regard, should I separate them?
> If they are in 1 single collection, when there are updates to only the
> supplier info, the product info will be indexed again even though there
> are no updates to it. Is my reasoning valid?
>
>
> On 4/27/2017 1:33 PM, Walter Underwood wrote:
>> Do they have the same fields or different fields? Are they updated
>> separately or together?
>>
>> If they have the same fields and are updated together, I’d put them
>> in the same collection. Otherwise, probably separate. 

Walter's statements are right on the money; you just might need a little
more detail.

There are two critical details that decide whether you even CAN
combine different data in a single index: One is that all types of
records must use the same field (the uniqueKey field) to determine
uniqueness, and the value of this field must be unique across the entire
dataset.  The other is that there SHOULD be a field with a name like
"type" that your search client can use to differentiate the different
kinds of documents.  This type field is not necessary, but it does make
things easier.
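
Both points can be checked before indexing. The following is a sketch under assumed names: the record sets are plain lists of dictionaries, "id" stands in for whatever the uniqueKey field is called, and the prefixes are one possible way to resolve collisions.

```python
from collections import Counter


def find_key_collisions(*record_sets, key="id"):
    """Return key values that occur more than once across all record sets."""
    counts = Counter(rec[key] for records in record_sets for rec in records)
    return sorted(value for value, n in counts.items() if n > 1)


def tag_and_prefix(records, doc_type, prefix, key="id"):
    """Add a 'type' field and prefix the key so it is unique across types."""
    return [
        {**rec, key: f"{prefix}{rec[key]}", "type": doc_type} for rec in records
    ]


if __name__ == "__main__":
    suppliers = [{"id": "1"}, {"id": "2"}]
    products = [{"id": "2"}, {"id": "3"}]
    print(find_key_collisions(suppliers, products))
    combined_suppliers = tag_and_prefix(suppliers, "supplier", "s")
    combined_products = tag_and_prefix(products, "product", "p")
    print(find_key_collisions(combined_suppliers, combined_products))
```

If the first check reports collisions, documents of different types would overwrite each other in a combined index unless the keys are made distinct first.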

Assuming you CAN combine documents, there is still the question of
whether you SHOULD.  If the fields that you will commonly search are the
same between the different kinds of documents, and if people want to be
able to do one search and get more than one of the document types you
are indexing, then it is something you should consider.  If people will
only ever search one type of document, you should probably keep them in
separate indexes to keep things cleaner.

Thanks,
Shawn


Re: 1 main collection or multiple smaller collections?

Posted by Derek Poh <dp...@globalsources.com>.
There are some common fields between them.
At the source data end (database), the supplier info and product info 
are updated separately. In this regard, should I separate them?
If they are in 1 single collection, when there are updates to only the 
supplier info, the product info will be indexed again even though there 
are no updates to it. Is my reasoning valid?
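
To make the concern concrete, here is a small sketch of the bookkeeping this implies, assuming a layout where supplier fields are copied into each product document. The mapping from product id to supplier id and the function name are hypothetical.

```python
def docs_to_reindex(changed_supplier_ids, product_to_supplier):
    """List the product documents that carry a changed supplier's info."""
    changed = set(changed_supplier_ids)
    return sorted(
        product_id
        for product_id, supplier_id in product_to_supplier.items()
        if supplier_id in changed
    )


if __name__ == "__main__":
    product_to_supplier = {"P1": "S1", "P2": "S1", "P3": "S2"}
    # Only supplier S1 changed, yet both of its products must be re-sent.
    print(docs_to_reindex(["S1"], product_to_supplier))
```

With around 300,000 product documents this extra indexing is likely to be cheap, but the cost grows with the number of products per supplier and with how often supplier info changes.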





Re: 1 main collection or multiple smaller collections?

Posted by Walter Underwood <wu...@wunderwood.org>.
Do they have the same fields or different fields? Are they updated separately or together?

If they have the same fields and are updated together, I’d put them in the same collection. Otherwise, probably separate.

wunder
Walter Underwood
wunder@wunderwood.org
http://observer.wunderwood.org/  (my blog)

