Posted to users@solr.apache.org by "Saur, Alexandre (ELS-AMS)" <a....@elsevier.com> on 2021/10/13 09:46:15 UTC

Concurrent transactions and autocommit in Solr 8

Hi,

I have a (newbie) question about Solr 8 autocommit behaviour. This is my scenario:

- Autocommit configured in solrconfig
- ETL job that indexes thousands of documents whenever it runs

The ETL job updates the collection in the following manner: first it deletes a series of documents based on a key and then it adds new ones using the same key, with updated values (the transactions are always in this order). The job (client) does not commit its transactions (neither delete nor insert).

Given this scenario, is it possible that the delete and insert are applied in a different order when Solr autocommits? In other words, could the newly inserted documents end up being deleted?




________________________________

Elsevier B.V. Registered Office: Radarweg 29, 1043 NX Amsterdam, The Netherlands, Registration No. 33158992, Registered in The Netherlands.

Re: Concurrent transactions and autocommit in Solr 8

Posted by "Saur, Alexandre (ELS-AMS)" <a....@elsevier.com>.
I understand your concern. I'll add more information to help clarify how the transactions are done:

- The whole indexing process (client) runs in a Spark cluster. For any given document, the removal and the insertion are done by the same Spark executor.
- The pipes that perform removal and insertion use different Solr client instances (instances of CloudSolrClient). However, the client instances share the same HttpClient.

Not sure how (and if) this affects Solr's autocommit feature.

________________________________
From: Deepak Goel <de...@gmail.com>
Sent: 13 October 2021 12:01
To: users@solr.apache.org <us...@solr.apache.org>
Subject: Re: Concurrent transactions and autocommit in Solr 8

Hello

If the insert and delete are issued from two different threads (with no
synchronization), the new records could end up being deleted. We might
have to dig into the Solr code a bit, I guess.

Deepak



Re: Concurrent transactions and autocommit in Solr 8

Posted by Deepak Goel <de...@gmail.com>.
Hello

If the insert and delete are issued from two different threads (with no
synchronization), the new records could end up being deleted. We might
have to dig into the Solr code a bit, I guess.
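
To make the synchronization point concrete, here is a minimal sketch of
chaining the add after the delete on the client side, so the add is only
issued once the delete has completed, even when the two run on different
threads. It says nothing about how Solr orders updates internally or how
autocommit behaves. FakeIndex, replaceByKey and demo are hypothetical names
invented for this illustration; FakeIndex is an in-memory stand-in, not a
Solr or SolrJ API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class OrderedUpdateSketch {

    /** Hypothetical in-memory stand-in for a collection; NOT a Solr API. */
    static final class FakeIndex {
        private final List<String[]> docs = new ArrayList<>(); // each entry: {key, value}

        synchronized void deleteByKey(String key) {
            docs.removeIf(doc -> doc[0].equals(key));
        }

        synchronized void add(String key, String value) {
            docs.add(new String[] {key, value});
        }

        synchronized List<String> valuesFor(String key) {
            List<String> values = new ArrayList<>();
            for (String[] doc : docs) {
                if (doc[0].equals(key)) {
                    values.add(doc[1]);
                }
            }
            return values;
        }
    }

    /**
     * Replaces all documents for a key. The add stage is chained onto the
     * delete stage, so it starts only after the delete has completed, even
     * if the two stages run on different pool threads.
     */
    static List<String> replaceByKey(FakeIndex index, ExecutorService pool,
                                     String key, List<String> newValues) {
        CompletableFuture<Void> deleted =
                CompletableFuture.runAsync(() -> index.deleteByKey(key), pool);
        CompletableFuture<Void> added =
                deleted.thenRunAsync(() -> newValues.forEach(v -> index.add(key, v)), pool);
        added.join(); // block until both stages have finished
        return index.valuesFor(key);
    }

    /** Runs the sketch once and returns the values left for key "k1". */
    static List<String> demo() {
        FakeIndex index = new FakeIndex();
        index.add("k1", "old-1");
        index.add("k1", "old-2");
        index.add("k2", "untouched");
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            return replaceByKey(index, pool, "k1", List.of("new-1", "new-2"));
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

If the two stages were instead submitted to the pool independently, nothing
would stop the delete from running after the add, which is the situation
described above.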

Deepak

