Posted to dev@lucene.apache.org by Shai Erera <se...@gmail.com> on 2015/11/10 13:54:22 UTC
Inconsistent error returned by Solr when indexing bad documents
Hi,
I wanted to test the error message that Solr returns when indexing a
document with an unknown field. Surprisingly, I get different errors,
depending on whether the request hits the shard's leader or not.
To reproduce (5.3.1):
bin/solr -e cloud
ports: 8983, 7574
config: basic_configs
shards: 1
replicas: 2
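For the leader check in the next step, the Collections API can be queried at a URL such as http://localhost:8983/solr/admin/collections?action=CLUSTERSTATUS&wt=json. Below is a minimal sketch of picking the leader out of that response. The payload is illustrative rather than captured output: it is trimmed to the fields the code reads, the core_node keys are made up, and find_leader is a hypothetical helper, not part of Solr or SolrJ.

```python
import json

# Illustrative CLUSTERSTATUS payload, trimmed to the fields used below.
# Core names and ports mirror this thread; the core_node keys are made up.
SAMPLE_STATUS = json.loads("""
{
  "cluster": {
    "collections": {
      "gettingstarted": {
        "shards": {
          "shard1": {
            "replicas": {
              "core_node1": {
                "core": "gettingstarted_shard1_replica1",
                "base_url": "http://169.254.21.228:7574/solr",
                "state": "active",
                "leader": "true"
              },
              "core_node2": {
                "core": "gettingstarted_shard1_replica2",
                "base_url": "http://169.254.21.228:8983/solr",
                "state": "active"
              }
            }
          }
        }
      }
    }
  }
}
""")


def find_leader(status, collection, shard):
    """Return the replica entry flagged as leader, or None if there is none."""
    shards = status["cluster"]["collections"][collection]["shards"]
    for replica in shards[shard]["replicas"].values():
        # Assumption: the flag is serialized as the string "true".
        if str(replica.get("leader", "")).lower() == "true":
            return replica
    return None


leader = find_leader(SAMPLE_STATUS, "gettingstarted", "shard1")
print(leader["base_url"], leader["core"])
```

Against a live cluster the same function would be applied to the parsed response body instead of SAMPLE_STATUS.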
Wait for the nodes to come up and issue a CLUSTERSTATUS call to check which
replica is the leader. In my case, 7574 was the leader. Now index a
document with an unknown field:
curl -i -X POST http://localhost:8983/solr/gettingstarted/update/json \
  -d '[{"id" : "1", "unknown" : "foo"}]'
And you get back
{"responseHeader":{"status":400,"QTime":6},"error":{"msg":"Bad
Request\n\n\n\nrequest:
http://169.254.21.228:7574/solr/gettingstarted_shard1_replica1/update?update.distrib=TOLEADER&distrib.from=http%3A%2F%2F169.254.21.228%3A8983%2Fsolr%2Fgettingstarted_shard1_replica2%2F&wt=javabin&version=2
","code":400}}
But if you execute:
curl -i -X POST http://localhost:7574/solr/gettingstarted/update/json \
  -d '[{"id" : "1", "unknown" : "foo"}]'
Then you get back
{"responseHeader":{"status":400,"QTime":1},"error":{"msg":"ERROR: [doc=1]
unknown field 'unknown'","code":400}}
In both cases you get back 400, but if the request hits the leader you get
a more expressive error message. Is there any reason for that behavior?
Can't the replica just pass along the error that it got from the leader?
Shai
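To make the difference between the two responses concrete, here is a small sketch that parses both bodies quoted in the message above and compares them. The bodies are copied from the message with the mail client's line wraps rejoined; the exact whitespace at the wrap points is an assumption. error_summary is a hypothetical helper name.

```python
import json

# Body returned when the request goes to the non-leader (port 8983).
VIA_NON_LEADER = (
    r'{"responseHeader":{"status":400,"QTime":6},"error":{"msg":"Bad Request\n\n\n\n'
    r'request: http://169.254.21.228:7574/solr/gettingstarted_shard1_replica1/update'
    r'?update.distrib=TOLEADER&distrib.from=http%3A%2F%2F169.254.21.228%3A8983%2Fsolr'
    r'%2Fgettingstarted_shard1_replica2%2F&wt=javabin&version=2","code":400}}'
)

# Body returned when the request goes straight to the leader (port 7574).
VIA_LEADER = (
    r'''{"responseHeader":{"status":400,"QTime":1},'''
    r'''"error":{"msg":"ERROR: [doc=1] unknown field 'unknown'","code":400}}'''
)


def error_summary(body):
    """Return (code, msg) from the "error" object of a Solr JSON response."""
    error = json.loads(body)["error"]
    return error["code"], error["msg"]


for label, body in (("non-leader", VIA_NON_LEADER), ("leader", VIA_LEADER)):
    code, msg = error_summary(body)
    # Only the first line of the message is printed to keep the output short.
    print(label, code, msg.splitlines()[0])
```

Both bodies carry code 400, but only the leader's message names the offending field; the non-leader's message contains just the HTTP reason phrase and the forwarded request URL.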
Re: Inconsistent error returned by Solr when indexing bad documents
Posted by Mikhail Khludnev <mk...@griddynamics.com>.
Hello,
Here is my attempt. It might be too heavy but it proves the problem at
least.
https://paste.apache.org/dnHa
On Wed, Nov 11, 2015 at 8:59 AM, Shai Erera <se...@gmail.com> wrote:
> OK just wanted to clarify that this happens even when indexing a single
> document, using curl or SolrJ:
>
> curl -i -X POST http://localhost:7574/solr/gettingstarted/update/json/docs
> -d '{"id" : "1", "unknown" : "foo"}'
>
> So it's not only with bulk updates.
>
> I will nevertheless reproduce this in a proper unit test.
>
> Shai
--
Sincerely yours
Mikhail Khludnev
Principal Engineer,
Grid Dynamics
<http://www.griddynamics.com>
<mk...@griddynamics.com>
Re: Inconsistent error returned by Solr when indexing bad documents
Posted by Mark Miller <ma...@gmail.com>.
Yeah, the error handling is the same regardless. The error handling code is
tricky to follow because of batch and streaming, but they don't have
special handling versus a single update.
- Mark
On Wed, Nov 11, 2015 at 1:00 AM Shai Erera <se...@gmail.com> wrote:
> OK just wanted to clarify that this happens even when indexing a single
> document, using curl or SolrJ:
>
> curl -i -X POST http://localhost:7574/solr/gettingstarted/update/json/docs
> -d '{"id" : "1", "unknown" : "foo"}'
>
> So it's not only with bulk updates.
>
> I will nevertheless reproduce this in a proper unit test.
>
> Shai
- Mark
about.me/markrmiller
Re: Inconsistent error returned by Solr when indexing bad documents
Posted by Shai Erera <se...@gmail.com>.
OK just wanted to clarify that this happens even when indexing a single
document, using curl or SolrJ:
curl -i -X POST http://localhost:7574/solr/gettingstarted/update/json/docs \
  -d '{"id" : "1", "unknown" : "foo"}'
So it's not only with bulk updates.
I will nevertheless reproduce this in a proper unit test.
Shai
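For anyone scripting the single-document case, a rough Python equivalent of the curl command above might look like the following. It is a sketch under assumptions: build_update_request and send are hypothetical helper names, the explicit JSON content type is an addition (the curl command sets none), and send assumes the error body is JSON as in the responses shown earlier in the thread.

```python
import json
import urllib.error
import urllib.request


def build_update_request(port, doc):
    """Build (but do not send) a single-document update request."""
    url = "http://localhost:%d/solr/gettingstarted/update/json/docs" % port
    return urllib.request.Request(
        url,
        data=json.dumps(doc).encode("utf-8"),
        # Assumption: an explicit JSON content type; the curl command sets none.
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request):
    """Send the request; return (HTTP status, error message or None)."""
    try:
        with urllib.request.urlopen(request) as resp:
            return resp.status, None
    except urllib.error.HTTPError as err:
        # Assumption: the error body is JSON with an "error" object.
        body = json.loads(err.read().decode("utf-8"))
        return err.code, body.get("error", {}).get("msg")


req = build_update_request(7574, {"id": "1", "unknown": "foo"})
print(req.get_method(), req.full_url)
# With the example cluster running: print(send(req))
```

Calling build_update_request with 8983 and 7574 in turn and comparing the messages returned by send would reproduce the comparison without curl.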
On Tue, Nov 10, 2015 at 6:31 PM, Mark Miller <ma...@gmail.com> wrote:
> It's kind of tricky. And it doesn't help that we don't have real per
> update errors when doing bulk or streaming.
>
> But you can start looking around DistributedUpdateProcessor,
> SolrCmdDistributor, StreamingSolrClients for update error propagation
> 'stuff'.
>
> - Mark
Re: Inconsistent error returned by Solr when indexing bad documents
Posted by Mark Miller <ma...@gmail.com>.
It's kind of tricky. And it doesn't help that we don't have real per-update
errors when doing bulk or streaming.
But you can start looking around DistributedUpdateProcessor,
SolrCmdDistributor, StreamingSolrClients for update error propagation
'stuff'.
- Mark
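As a purely conceptual illustration of what "update error propagation" means here, the sketch below contrasts a forwarding node that keeps only the status and reason phrase with one that passes the leader's message through. None of this is Solr code: RemoteError, leader_index, forward_lossy and forward_preserving are hypothetical names, and the thread does not confirm that the classes named above lose the message in exactly this way.

```python
class RemoteError(Exception):
    """Error reported by another node: status code, reason phrase, message."""

    def __init__(self, code, reason, remote_msg=None):
        # Fall back to the reason phrase when no detailed message is known.
        super().__init__(remote_msg or reason)
        self.code = code
        self.reason = reason
        self.remote_msg = remote_msg


def leader_index(doc, known_fields):
    """Stand-in for the leader: reject fields that are not in the schema."""
    for field in doc:
        if field not in known_fields:
            raise RemoteError(
                400, "Bad Request",
                "ERROR: [doc=%s] unknown field '%s'" % (doc["id"], field))


def forward_lossy(doc, known_fields):
    """Forwarding node that keeps only the status code and reason phrase."""
    try:
        leader_index(doc, known_fields)
    except RemoteError as err:
        raise RemoteError(err.code, err.reason) from err


def forward_preserving(doc, known_fields):
    """Forwarding node that passes the leader's message through unchanged."""
    try:
        leader_index(doc, known_fields)
    except RemoteError as err:
        raise RemoteError(err.code, err.reason, err.remote_msg) from err


def message_of(call, doc, known_fields):
    """Return the error message the client would see, or None on success."""
    try:
        call(doc, known_fields)
    except RemoteError as err:
        return str(err)
    return None


doc = {"id": "1", "unknown": "foo"}
print(message_of(forward_lossy, doc, {"id"}))
print(message_of(forward_preserving, doc, {"id"}))
```

The first call surfaces only the reason phrase, while the second surfaces the same message a client would get from the leader directly, which is the behavior asked for at the start of the thread.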
On Tue, Nov 10, 2015 at 10:39 AM Shai Erera <se...@gmail.com> wrote:
> Thanks Mark, I wrote a test for it, I can port it to Solr's
> test-framework. Can you also give me a hint in what area of the code I
> should look to fix it?
>
> Shai
- Mark
about.me/markrmiller
Re: Inconsistent error returned by Solr when indexing bad documents
Posted by Shai Erera <se...@gmail.com>.
Thanks Mark. I wrote a test for it and can port it to Solr's test-framework.
Can you also give me a hint about which area of the code I should look at to
fix it?
Shai
On Tue, Nov 10, 2015 at 4:59 PM, Mark Miller <ma...@gmail.com> wrote:
> It looks like it's not properly propagating the root error. We probably need a
> test for that.
>
> - Mark
Re: Inconsistent error returned by Solr when indexing bad documents
Posted by Mark Miller <ma...@gmail.com>.
It looks like it's not properly propagating the root error. We probably need a
test for that.
- Mark
--
- Mark
about.me/markrmiller