Posted to solr-user@lucene.apache.org by Amit Makwana <AM...@capitalnovus.com> on 2017/08/04 10:10:31 UTC
Fw: Indexing issue in Solrcloud


Hello,

We are facing an indexing issue in SolrCloud.

Solr version: 6.3.0

Solr Cloud Configuration:

  *   3-node ZooKeeper ensemble with 3 Solr instances.
  *   1 collection with 3 shards and 2 replicas.


  *   Indexing 10000 documents.
      Documents indexed per shard: Shard1=3393, Shard2=3351, Shard3=3256

Testing:

  *   Bring down any leader replica while indexing is running.

  *   A "Connection refused" exception is generated continuously in the sendUpdateStream method of class ConcurrentUpdateSolrClient.

  *   The import reports 10000 records successfully indexed, but when I run a match-all (*:*) query, Solr returns only 6607 documents; the other 3393 documents are not returned in the result.

Example:

  *   We issued the dataimport command from devm50:8091. The devm50:8091 Solr instance is responsible for document routing: it indexes some of the documents in its local index and sends the others to the shards they belong to.

  *   Suppose we bring down the devm50:8092 Solr instance, which hosts the leader replica of shard1. The replica on devm50:8093 is then selected as the new leader.

  *   The data import screen displays 10000 documents indexed/processed, but a search shows that only 6607 documents were actually indexed. A search in shard1 returns 0 documents: the 3393 documents of shard1 were not indexed, and a "Connection refused" exception is generated.
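
As a sanity check on the numbers above, here is a minimal Java sketch that compares the reported counts. The class and method names are made up for illustration; the figures are the ones quoted in this message. It only shows that the number of missing documents matches the shard1 count exactly, which is consistent with the loss being confined to shard1.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative only: names are hypothetical, numbers come from the report above.
class ShardCountCheck {
    static final int SUBMITTED = 10000;          // documents reported as indexed by the import
    static final int FOUND_BY_MATCH_ALL = 6607;  // documents returned by the *:* query

    // Per-shard distribution observed in the run without any failure.
    static Map<String, Integer> expectedPerShard() {
        Map<String, Integer> counts = new LinkedHashMap<>();
        counts.put("shard1", 3393);
        counts.put("shard2", 3351);
        counts.put("shard3", 3256);
        return counts;
    }

    static int total(Map<String, Integer> counts) {
        int sum = 0;
        for (int c : counts.values()) {
            sum += c;
        }
        return sum;
    }

    static int missing() {
        return SUBMITTED - FOUND_BY_MATCH_ALL;
    }

    public static void main(String[] args) {
        Map<String, Integer> counts = expectedPerShard();
        System.out.println("expected total        = " + total(counts));
        System.out.println("missing documents     = " + missing());
        System.out.println("missing == shard1 docs: " + (missing() == counts.get("shard1")));
    }
}
```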



Screenshots included in the original message (not available in this plain-text version): Dataimport; *:* query; *:* query in Shard1.

From debugging the code, we found the following:

  *   In the ConcurrentUpdateSolrClient class, documents are sent to the target shard and its replica as SolrJ requests. These requests are added to the Runner queue and are run later by the scheduler.

  *   When a leader goes down, the Runner queue may still hold many requests that point to the old leader. The client tries to send those requests to the old leader and gets a "Connection refused" exception. Although a new leader has been selected for that shard, the pending requests are not modified, so the Runner queue keeps holding requests that point to the old leader instead of the new one.

  *   There is no code for handling this case: if an exception is generated in ConcurrentUpdateSolrClient's sendInputStream(), it is not passed back to its caller.
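
To make the behaviour described above concrete, here is a small toy model in plain Java. It is not the SolrJ code: QueuedUpdate, drain and the use of bare host:port strings as targets are all hypothetical, chosen only to show how requests whose target was fixed at enqueue time are lost when that target goes down and the error never reaches the caller.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

// Toy model only: not SolrJ internals. All names here are hypothetical.
class StaleLeaderQueueDemo {
    // A queued update whose target node is decided when it is enqueued.
    static final class QueuedUpdate {
        final int docId;
        final String targetNode;

        QueuedUpdate(int docId, String targetNode) {
            this.docId = docId;
            this.targetNode = targetNode;
        }
    }

    // Drains the queue and returns the ids of documents whose send failed.
    // A send "fails" when the update still targets the node that is down.
    static List<Integer> drain(Queue<QueuedUpdate> queue, String deadNode) {
        List<Integer> lost = new ArrayList<>();
        QueuedUpdate update;
        while ((update = queue.poll()) != null) {
            if (update.targetNode.equals(deadNode)) {
                // Models "Connection refused": the failure is only noted here,
                // the code that enqueued the document is never told about it.
                lost.add(update.docId);
            }
        }
        return lost;
    }

    static List<Integer> run() {
        String oldLeader = "devm50:8092";
        Queue<QueuedUpdate> queue = new ArrayDeque<>();
        for (int id = 1; id <= 5; id++) {
            queue.add(new QueuedUpdate(id, oldLeader));
        }
        // The old leader goes down and devm50:8093 takes over,
        // but the queued updates are left pointing at devm50:8092.
        return drain(queue, oldLeader);
    }

    public static void main(String[] args) {
        System.out.println("documents lost: " + run());
    }
}
```

In this model every queued document is lost while the producer side still counts all of them as processed, which matches the symptom of the import screen showing 10000 while shard1 holds 0 documents.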

We have tried to solve this issue by modifying the ConcurrentUpdateSolrClient class:

  *   We added code that, when an IOException is generated in ConcurrentUpdateSolrClient's sendInputStream method, modifies the current client request and resubmits it to the new leader node.

  *   With this change we are able to index some of the previously failed documents, but some documents are still missing: a search now returns 9850 documents.
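
The general idea of the change can be sketched as follows. This is a simplified illustration in plain Java, not our actual patch and not SolrJ API: the Sender interface, sendWithRetry and the leader lookup via a Supplier are assumptions made for the example. The target is looked up again on every attempt, and if all attempts fail the exception is thrown to the caller instead of being dropped.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.function.Supplier;

// Illustrative sketch only: all names are hypothetical, this is not SolrJ code.
class ResubmitToNewLeader {
    // Hypothetical transport: sends one document to one node.
    interface Sender {
        void send(String node, String doc) throws IOException;
    }

    // Tries to send the document, asking for the current leader before each attempt.
    // Returns the node that accepted the document, or throws if every attempt failed.
    static String sendWithRetry(String doc, Supplier<String> currentLeader,
                                Sender sender, int maxAttempts) throws IOException {
        IOException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            String target = currentLeader.get(); // re-resolved on every attempt
            try {
                sender.send(target, doc);
                return target;
            } catch (IOException e) {
                last = e; // remember the failure and try again with a fresh lookup
            }
        }
        // Surface the failure to the caller so the document is not silently lost.
        throw new IOException("update failed after " + maxAttempts + " attempts", last);
    }

    static String demo() {
        String[] leader = {"devm50:8092"};
        Sender sender = (node, doc) -> {
            if (node.equals("devm50:8092")) {
                leader[0] = "devm50:8093"; // models the new leader being selected
                throw new IOException("Connection refused: " + node);
            }
        };
        try {
            return sendWithRetry("doc-1", () -> leader[0], sender, 3);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("accepted by " + demo());
    }
}
```

The sketch assumes that the new leader is already known by the time of the retry. If the lookup still returns the old leader, the bounded retry count decides how long the client keeps trying before it reports the failure.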


Please find the log attached.


Amit Makwana | Software Engineer

CAPITAL NOVUS

Governance  |  Compliance  |  eDiscovery
A-501, APPL, IT-SEZ,
K Raheja Road, Koba, Gandhinagar: 382009
Office: 079.65721500 | Extn: 1646
AMakwana@capitalnovus.com | www.capitalnovus.com

Washington, DC | New York | London | Paris | Gandhinagar | Tokyo