Posted to issues@solr.apache.org by "Marc Brette (Jira)" <ji...@apache.org> on 2022/03/21 10:51:00 UTC
[jira] [Created] (SOLR-16108) Incorrect distribution of records in shards after a split with splitByPrefix, when using the CompositeId router with a router field defined
Marc Brette created SOLR-16108:
----------------------------------
Summary: Incorrect distribution of records in shards after a split with splitByPrefix, when using the CompositeId router with a router field defined
Key: SOLR-16108
URL: https://issues.apache.org/jira/browse/SOLR-16108
Project: Solr
Issue Type: Bug
Security Level: Public (Default Security Level. Issues are Public)
Components: SolrCloud
Affects Versions: 8.4
Reporter: Marc Brette
When a collection is created using the CompositeId router with a router field defined, one of its shards contains records that all share the same routing key, and that shard is split with the splitByPrefix parameter, we expect the records to be distributed roughly evenly between the two resulting shards.
Instead, one shard contains no records and the other contains all of them.
Steps to reproduce:
{code:java}
docker network create solr-network
# run in one terminal
docker run -it -h solr1 --name solr1 --net solr-network -p 18983:8983 solr:8.4 /opt/solr/bin/solr -c -f
# run in another terminal
docker run -it -h solr2 --name solr2 --net solr-network -p 28983:8983 solr:8.4 /opt/solr/bin/solr -c -f -z solr1:9983
#-----------------------------------------------------------------------------------------------
# Works, documents are split between the 2 shards
# Create collection with default compositeId router, routing key in the id, only one shard
curl --request GET \
--url 'http://localhost:18983/solr/admin/collections?action=CREATE&name=routing_by_id&numShards=1'
# Create enough documents, they all have the same routing key (france!)
for i in {0..100}
do
curl --request POST \
--url 'http://localhost:18983/solr/routing_by_id/update/json/docs?commit=true' \
--header 'Content-Type: application/json' \
--data "[{
\"id\": \"france\!${i}0\",
\"title_t\": \"hi\"
}]"
done
# Check it is indexed correctly
curl --request GET \
--url 'http://localhost:18983/solr/routing_by_id/select?q=*%3A*'
# Split the shard
curl --request GET \
--url 'http://localhost:18983/solr/admin/collections?action=SPLITSHARD&collection=routing_by_id&shard=shard1&splitByPrefix=true'
# Check records in shard1_0 (~half of the documents there)
curl --request GET \
--url 'http://localhost:18983/solr/routing_by_id/select?q=*%3A*&shards=shard1_0'
# Check records in shard1_1 (~half of the documents there)
curl --request GET \
--url 'http://localhost:18983/solr/routing_by_id/select?q=*%3A*&shards=shard1_1'
#-----------------------------------------------------------------------------------------------
# Fails, does not split documents in both shards
# Create collection with default compositeId router, routing key in the field "route_t", only one shard
curl --request GET \
--url 'http://localhost:18983/solr/admin/collections?action=CREATE&name=routing_by_field&numShards=1&router.field=route_t'
# Create enough documents; they all have the same routing key (france, in the route_t field)
for i in {0..100}
do
curl --request POST \
--url 'http://localhost:18983/solr/routing_by_field/update/json/docs?commit=true' \
--header 'Content-Type: application/json' \
--data "[{
\"id\": \"${i}0\",
\"title_t\": \"hi\",
\"route_t\": \"france\"
}]"
done
# Check it is indexed correctly
curl --request GET \
--url 'http://localhost:18983/solr/routing_by_field/select?q=*%3A*'
# Split the shard
curl --request GET \
--url 'http://localhost:18983/solr/admin/collections?action=SPLITSHARD&collection=routing_by_field&shard=shard1&splitByPrefix=true'
# Check records in shard1_0: no document!
curl --request GET \
--url 'http://localhost:18983/solr/routing_by_field/select?q=*%3A*&shards=shard1_0'
# Check records in shard1_1: all documents!
curl --request GET \
--url 'http://localhost:18983/solr/routing_by_field/select?q=*%3A*&shards=shard1_1'
{code}
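The per-shard checks above return every matching document, which makes the two cases hard to compare at a glance. A minimal sketch of a helper that prints only the document count per sub-shard follows. The names SOLR_URL, num_found and shard_counts are illustrative and not part of the original report, and the sed pattern assumes the standard JSON response, in which "numFound" is followed by an integer.

```shell
#!/usr/bin/env bash
# Sketch: count documents per sub-shard after SPLITSHARD.
# SOLR_URL, num_found and shard_counts are hypothetical names for illustration.

SOLR_URL="${SOLR_URL:-http://localhost:18983/solr}"

# Read a Solr JSON response on stdin and print the first numFound value.
num_found() {
  sed -n 's/.*"numFound":[[:space:]]*\([0-9][0-9]*\).*/\1/p' | head -n 1
}

# Print "<shard> <count>" for each shard name given after the collection name.
shard_counts() {
  local collection="$1"
  shift
  local shard
  for shard in "$@"; do
    # rows=0 asks Solr for the hit count only, without the documents themselves.
    printf '%s %s\n' "$shard" \
      "$(curl --silent "${SOLR_URL}/${collection}/select?q=*%3A*&rows=0&shards=${shard}" | num_found)"
  done
}
```

With the containers from the steps above still running, `shard_counts routing_by_id shard1_0 shard1_1` should print two counts of similar size, while `shard_counts routing_by_field shard1_0 shard1_1` should show the problem described here: 0 for one sub-shard and every document in the other.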