Posted to solr-user@lucene.apache.org by Chris Vizisa <ch...@gmail.com> on 2016/06/06 14:27:18 UTC

NRT updates

Hi,

Does the number of fields in a document affect NRT updates?
I have around 1.6 million products. Each product can be available in about
3000 stores.
In addition to around 50 fields related to a product, I am storing
product_store info in each product document, like:
 1. Quantity of that product in each store (store_n1_count,
store_n2_count, ..., store_3000_count)
 2. Status of that product in each store (store_n1_status,
store_n2_status, ..., store_3000_status)
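
To make the layout concrete, here is a minimal Python sketch of what one
such product document might look like. The helper name build_product_doc,
the placeholder product_field_N names, and the default values are made up
for illustration; only the store_nN_count / store_nN_status naming pattern
and the rough sizes (about 50 product fields, about 3000 stores) come from
the description above.

```python
# Hypothetical sketch: one product document with per-store fields embedded.
NUM_STORES = 3000          # from the post: about 3000 stores per product
NUM_PRODUCT_FIELDS = 50    # from the post: around 50 product-level fields


def build_product_doc(product_id, num_stores=NUM_STORES):
    """Return a dict shaped like one embedded product document."""
    doc = {"id": product_id}
    # Placeholder product-level fields (names are invented for illustration).
    for i in range(1, NUM_PRODUCT_FIELDS + 1):
        doc[f"product_field_{i}"] = None
    # Two fields per store: a count and a status.
    for n in range(1, num_stores + 1):
        doc[f"store_n{n}_count"] = 0
        doc[f"store_n{n}_status"] = "unknown"
    return doc


doc = build_product_doc("P1")
field_count = len(doc)  # 1 id + 50 product fields + 2 * 3000 store fields
```

Under these assumptions each product document carries roughly 6000 fields,
which is the size the questions below are concerned with.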

I need to do NRT updates on the count and status of each product, across
all of the roughly 1.6 million products.

Q1. Is it okay to do NRT updates on this product collection (for each
    product's store_count and store_status) at around 900 updates per
    second across the different products? Please note that both the status
    and the count of each product get updated, and there are 1.6M such
    products.
Q2. Is it okay to use atomic updates for the NRT updates of multiple
    store_counts and multiple store_statuses of each product, across around
    1.6 million products in total? Or is there a better way to handle this
    amount of dynamic data change? For atomic updates, I understand that
    all fields need to be stored.
Q3. Can I keep all this info in the product collection itself, or should I
    store the store_status info separately, joined on productId, for the
    NRT scenario to work best? In that case each product_store entry is a
    separate document with only 3 or 4 fields, but there would be a very
    large number of documents (worst case 1.6M products multiplied by 3000
    stores, i.e. 4.8 billion).
Q4. When we embed all store-related info in the product doc itself, a
    single product doc can be a candidate for simultaneous updates, as its
    count or status can change in different stores at the same time. If we
    go for a separate collection holding product_status info, mostly only
    one doc is updated at a time. Which is more efficient?
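
The two layouts being compared in Q2 to Q4 can be sketched as atomic-update
payloads. This is only an illustration in Python: the function names, the
"P1_n42" id scheme for a separate product_store document, and the short
field names count and status are assumptions, not part of any existing
schema. The {"set": value} modifier is the standard Solr atomic-update
syntax; nothing is sent to a server here, the code only builds the JSON
bodies.

```python
import json


def embedded_update(product_id, store_no, count, status):
    """Atomic update of two per-store fields inside one large product doc."""
    return [{
        "id": product_id,
        f"store_n{store_no}_count": {"set": count},
        f"store_n{store_no}_status": {"set": status},
    }]


def separate_update(product_id, store_no, count, status):
    """Atomic update of one small product_store doc (hypothetical id scheme)."""
    return [{
        "id": f"{product_id}_n{store_no}",
        "count": {"set": count},
        "status": {"set": status},
    }]


# JSON bodies that could be posted to a collection's update handler.
payload_embedded = json.dumps(embedded_update("P1", 42, 7, "in_stock"))
payload_separate = json.dumps(separate_update("P1", 42, 7, "in_stock"))

# Worst-case number of documents if every product/store pair gets its own doc.
worst_case_docs = 1_600_000 * 3000
```

The payloads are about the same size either way. As far as I understand,
the difference is on the server side: an atomic update rebuilds the whole
document from its stored fields, so the embedded layout would reindex
thousands of fields per change, while the separate layout reindexes 3 or 4
fields per change but has to manage up to 4.8 billion documents.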


Could someone please suggest what is optimal? Any pointers are welcome.

Thanks!
Chris.

Re: NRT updates

Posted by Chris Vizisa <ch...@gmail.com>.
Hi,
Any pointers, suggestions, or experiences, please?

Thanks!
Chris.
