Posted to solr-user@lucene.apache.org by mganeshs <mg...@live.in> on 2018/02/16 04:19:11 UTC

In Place Updates not work as expected

All,

I have about 1M Solr documents (in production there would be even more),
each with a lot of fields and fairly large. We have a feature that needs to
update a specific field, or add a new field, in each document. Since we have
to do this for all 1M documents, it takes too long, which is not
acceptable.

So we thought of using "In Place Updates".

As per the documentation, we have made sure our fields meet these criteria:
-------------------------------
*An atomic update operation is performed using this approach only when the
fields to be updated meet these three conditions:

are non-indexed (indexed="false"), non-stored (stored="false"), single
valued (multiValued="false") numeric docValues (docValues="true") fields;

the _version_ field is also a non-indexed, non-stored single valued
docValues field; and,

copy targets of updated fields, if any, are also non-indexed, non-stored
single valued numeric docValues fields.*
-------------------------------
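The three conditions quoted above can be written down as a small predicate, which may help when double-checking a schema by hand. This is only a sketch: FieldDef and its boolean attributes are hypothetical stand-ins for the schema settings named in the quote, not a Solr API.

```java
import java.util.Collections;
import java.util.List;

final class InPlaceEligibility {

    // Hypothetical holder for the schema attributes named in the quoted criteria.
    static final class FieldDef {
        final boolean indexed, stored, multiValued, docValues, numeric;

        FieldDef(boolean indexed, boolean stored, boolean multiValued,
                 boolean docValues, boolean numeric) {
            this.indexed = indexed;
            this.stored = stored;
            this.multiValued = multiValued;
            this.docValues = docValues;
            this.numeric = numeric;
        }
    }

    // Non-indexed, non-stored, single-valued docValues field.
    static boolean isPlainDocValues(FieldDef f) {
        return !f.indexed && !f.stored && !f.multiValued && f.docValues;
    }

    static boolean canUpdateInPlace(FieldDef updated, FieldDef version,
                                    List<FieldDef> copyTargets) {
        // Condition 1: the updated field is a plain numeric docValues field.
        if (!(isPlainDocValues(updated) && updated.numeric)) return false;
        // Condition 2: _version_ is a plain docValues field as well.
        if (!isPlainDocValues(version)) return false;
        // Condition 3: every copy target is a plain numeric docValues field.
        for (FieldDef target : copyTargets) {
            if (!(isPlainDocValues(target) && target.numeric)) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        FieldDef version = new FieldDef(false, false, false, true, true);
        FieldDef good = new FieldDef(false, false, false, true, true);
        FieldDef storedField = new FieldDef(false, true, false, true, true);
        System.out.println(canUpdateInPlace(good, version, Collections.emptyList()));
        System.out.println(canUpdateInPlace(storedField, version, Collections.emptyList()));
    }
}
```

If any one of the attributes is off (for example stored="true"), the check fails, and per the quoted documentation the update would not be performed in place.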
To check whether it's working as expected:
* First we updated a normal field. It took around 1.5 hours to update all 1M
docs, since each complete document gets re-indexed.

* We then updated the docValues field. That also took around 1.5 hours for
1M docs.

In the second case we are updating a docValues field, which should not
re-index the complete document, so shouldn't it take less time?

What could be going wrong? I am using Solr 6.5.1. Is this a bug or expected
behavior?

Regards,




--
Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html

Re: In Place Updates not work as expected

Posted by Emir Arnautović <em...@sematext.com>.
Hi,
That's how you build a regular document. Incremental/atomic updates need to use update commands.
I did not check the latest SolrJ, so maybe there is a built-in way of doing that, but some quick googling showed how it can be achieved:

    SolrInputDocument doc2 = new SolrInputDocument();
    // A Map value turns the field into an atomic update; the key is the modifier.
    Map<String,String> fpValue2 = new HashMap<String, String>();
    fpValue2.put("add", "fp2");
    doc2.setField("FACTURES_PRODUIT", fpValue2);
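The snippet above uses the "add" modifier. The Solr reference guide lists "set" and "inc" as the modifiers that can be applied in place on qualifying fields. Below is a minimal sketch of what such an update document looks like, using only java.util so it runs without SolrJ. The field name "popularity" and the id "doc1" are made up for illustration. With SolrJ, the same modifier map would be passed to setField as in the snippet above, together with the document's unique key.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class InPlaceUpdatePayload {

    // Builds one atomic-update document: {"id": ..., field: {"set": value}}.
    static Map<String, Object> setUpdate(String id, String field, long value) {
        Map<String, Object> modifier = new LinkedHashMap<>();
        modifier.put("set", value);
        Map<String, Object> doc = new LinkedHashMap<>();
        doc.put("id", id);        // unique key of the document being updated
        doc.put(field, modifier); // a Map value marks the field as an atomic update
        return doc;
    }

    // Minimal JSON rendering, sufficient for nested maps, numbers and strings.
    static String toJson(Object value) {
        if (value instanceof Map) {
            StringBuilder sb = new StringBuilder("{");
            boolean first = true;
            for (Map.Entry<?, ?> e : ((Map<?, ?>) value).entrySet()) {
                if (!first) sb.append(',');
                first = false;
                sb.append(quote(String.valueOf(e.getKey())))
                  .append(':')
                  .append(toJson(e.getValue()));
            }
            return sb.append('}').toString();
        }
        if (value instanceof Number) return value.toString();
        return quote(String.valueOf(value));
    }

    static String quote(String s) {
        return "\"" + s.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    public static void main(String[] args) {
        // Prints {"id":"doc1","popularity":{"set":42}}
        System.out.println(toJson(setUpdate("doc1", "popularity", 42L)));
    }
}
```

The important difference from a regular add is that only the id and the modified field are sent. A document built with plain field values replaces the whole stored document instead.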
HTH,
Emir 
--
Monitoring - Log Management - Alerting - Anomaly Detection
Solr & Elasticsearch Consulting Support Training - http://sematext.com/



> On 16 Mar 2018, at 15:36, mganeshs <mg...@live.in> wrote:
> 
> Hi Emir,
> 
> It's normal setfield and addDocument
> 
> for ex.
> in a for loop 
>   solrInputDocument.setField(sFieldId, fieldValue);
> and after this, we add the created document.
>   solrClient.add(collectionName, solrInputDocuments);
> 
> I just want to know whether, we need to do something specific for in-place
> updates ? 
> 
> Kindly let me know,
> 
> Regards,


Re: In Place Updates not work as expected

Posted by mganeshs <mg...@live.in>.
Hi Emir,

It's a normal setField followed by adding the document.

For example, in a for loop:
   solrInputDocument.setField(sFieldId, fieldValue);
and after this, we add the created documents:
   solrClient.add(collectionName, solrInputDocuments);

I just want to know whether we need to do something specific for in-place
updates.

Kindly let me know,

Regards,





Re: In Place Updates not work as expected

Posted by Emir Arnautović <em...@sematext.com>.
Hi,
Can you share the part of the code where you prepare the update?

Thanks,
Emir



> On 14 Mar 2018, at 15:27, mganeshs <mg...@live.in> wrote:
> 
> Hi Emir,
> 
> I am using solrj to update the document. Is there any spl API to be used for
> in place Updates ? 
> 
> Yes are we are updating in Batch of 1000 documents. 
> 
> As I mentioned before, since I am updating only docvalues i expect it should
> update in faster than updating normal field. Isn't it ?
> 
> Regards,


Re: In Place Updates not work as expected

Posted by Shawn Heisey <el...@elyograg.org>.
On 3/14/2018 8:27 AM, mganeshs wrote:
> As I mentioned before, since I am updating only docvalues i expect it should
> update in faster than updating normal field. Isn't it ?

Maybe.  But not always.

To do an in-place update, Solr must rewrite the docValues data for that 
field in that segment.  It must write this data for *EVERY* document in 
that segment which has that field.

If the segment has a few documents, this will almost certainly be very 
fast.  But if the segment with the document you are updating is large 
and contains enough documents that there are a million unique values for 
that field, then Solr is going to have to gather those million values 
from the existing data, then write a new file containing a million 
values.  This isn't going to be fast ... and a standard update probably 
would be faster.
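As a rough back-of-the-envelope illustration of the point above (a toy model, not a measurement): assume all documents sit in one segment and that every batch is followed by a commit that rewrites the field's docValues for the whole segment once. Both assumptions, and the names used here, are hypothetical. The real flush behaviour depends on the commit settings.

```java
final class DocValuesRewriteCost {

    // Values written in total if each batch causes one full rewrite of the
    // field's docValues for the segment (the assumption stated above).
    static long valuesWritten(long docsInSegment, long docsToUpdate, long batchSize) {
        long rewrites = (docsToUpdate + batchSize - 1) / batchSize; // ceiling division
        return rewrites * docsInSegment;
    }

    public static void main(String[] args) {
        // 1M docs updated in batches of 1000: 1000 rewrites of 1M values each.
        System.out.println(valuesWritten(1_000_000L, 1_000_000L, 1_000L));     // 1000000000
        // Same update sent as a single batch: one rewrite of 1M values.
        System.out.println(valuesWritten(1_000_000L, 1_000_000L, 1_000_000L)); // 1000000
    }
}
```

Under these assumptions the number of values written grows with the number of rewrites rather than with the number of documents changed, which is why an in-place update of a large segment is not automatically cheaper.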

Segments are part of the organization of a Lucene index.  Solr is a 
Lucene application.

Thanks,
Shawn


Re: In Place Updates not work as expected

Posted by mganeshs <mg...@live.in>.
Hi Emir,

I am using SolrJ to update the documents. Is there any special API to be used
for in-place updates?

Yes, we are updating in batches of 1000 documents.

As I mentioned before, since I am updating only docValues, I expect it to
be faster than updating a normal field. Isn't it?

Regards,




Re: In Place Updates not work as expected

Posted by Emir Arnautović <em...@sematext.com>.
Hi,
Did you confirm that it actually does an in-place update? With an in-place update, only the docValues file should change after the update (maybe try a single one), if my understanding is right.
Do you update a full document or some test doc with a single field?
Do you batch updates or send them one by one?

Emir


