Posted to dev@lucene.apache.org by Shai Erera <se...@gmail.com> on 2015/11/10 13:54:22 UTC

Inconsistent error returned by Solr when indexing bad documents

Hi,

I wanted to test the error message that Solr returns when indexing a
document with an unknown field. Surprisingly, I get different errors,
depending on whether the request hits the shard's leader or not.

To reproduce (5.3.1):

bin/solr -e cloud
  ports: 8983, 7574
  config: basic_configs
  shards: 1
  replicas: 2

Wait for the nodes to come up and issue a CLUSTERSTATUS call to check which
replica is the leader. In my case, 7574 was the leader. Now index a
document with an unknown field:

curl -i -X POST http://localhost:8983/solr/gettingstarted/update/json \
  -d '[{"id" : "1", "unknown" : "foo"}]'

And you get back

{"responseHeader":{"status":400,"QTime":6},"error":{"msg":"Bad Request\n\n\n\nrequest: http://169.254.21.228:7574/solr/gettingstarted_shard1_replica1/update?update.distrib=TOLEADER&distrib.from=http%3A%2F%2F169.254.21.228%3A8983%2Fsolr%2Fgettingstarted_shard1_replica2%2F&wt=javabin&version=2","code":400}}

But if you execute:

curl -i -X POST http://localhost:7574/solr/gettingstarted/update/json \
  -d '[{"id" : "1", "unknown" : "foo"}]'

Then you get back

{"responseHeader":{"status":400,"QTime":1},"error":{"msg":"ERROR: [doc=1] unknown field 'unknown'","code":400}}

In both cases you get back 400, but if the request hits the leader you get
a more expressive error message. Is there any reason for that behavior?
Can't the replica just pass along the error that it got from the leader?
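
For anyone who wants to repeat the comparison, the steps above can be
sketched as a few shell helpers. This is only a sketch under assumptions:
the ports and the gettingstarted collection come from the cloud example,
the function names are made up for illustration, and the sed expression is
a crude stand-in for a real JSON parser.

```shell
#!/bin/sh
# Sketch only: ports and collection name are assumptions taken from the
# cloud example; the function names are hypothetical.

# URL of a Collections API CLUSTERSTATUS call, used to see which replica leads.
# $1 = port, $2 = collection
clusterstatus_url() {
  printf 'http://localhost:%s/solr/admin/collections?action=CLUSTERSTATUS&collection=%s&wt=json\n' "$1" "$2"
}

# Post the bad document to the node on the given port and print the raw reply.
# $1 = port
post_bad_doc() {
  curl -s -X POST "http://localhost:$1/solr/gettingstarted/update/json" \
    -d '[{"id" : "1", "unknown" : "foo"}]'
}

# Crude extraction of error.msg from a single-line JSON reply read on stdin.
# Assumes the message contains no escaped double quotes.
extract_msg() {
  sed -n 's/.*"msg":"\([^"]*\)".*/\1/p'
}

# Usage against a running cluster:
#   curl -s "$(clusterstatus_url 8983 gettingstarted)"
#   post_bad_doc 8983 | extract_msg
#   post_bad_doc 7574 | extract_msg
```

In the CLUSTERSTATUS output the leader replica is the one marked with
"leader":"true". Printing only the msg value for each port makes the
difference between the two replies easy to see side by side.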

Shai

Re: Inconsistent error returned by Solr when indexing bad documents

Posted by Mikhail Khludnev <mk...@griddynamics.com>.
Hello,

Here is my attempt. It might be too heavy but it proves the problem at
least.
https://paste.apache.org/dnHa


On Wed, Nov 11, 2015 at 8:59 AM, Shai Erera <se...@gmail.com> wrote:

> OK just wanted to clarify that this happens even when indexing a single
> document, using curl or SolrJ:
>
> curl -i -X POST http://localhost:7574/solr/gettingstarted/update/json/docs \
>   -d '{"id" : "1", "unknown" : "foo"}'
>
> So it's not only with bulk updates.
>
> I will nevertheless reproduce this in a proper unit test.
>
> Shai
>

-- 
Sincerely yours
Mikhail Khludnev
Principal Engineer,
Grid Dynamics

<http://www.griddynamics.com>
<mk...@griddynamics.com>

Re: Inconsistent error returned by Solr when indexing bad documents

Posted by Mark Miller <ma...@gmail.com>.
Yeah, the error handling is the same regardless. The error handling code is
tricky to follow because of batch and streaming, but those paths don't have
special handling compared with a single update.

- Mark

On Wed, Nov 11, 2015 at 1:00 AM Shai Erera <se...@gmail.com> wrote:

> OK just wanted to clarify that this happens even when indexing a single
> document, using curl or SolrJ:
>
> curl -i -X POST http://localhost:7574/solr/gettingstarted/update/json/docs \
>   -d '{"id" : "1", "unknown" : "foo"}'
>
> So it's not only with bulk updates.
>
> I will nevertheless reproduce this in a proper unit test.
>
> Shai
>
> --
- Mark
about.me/markrmiller

Re: Inconsistent error returned by Solr when indexing bad documents

Posted by Shai Erera <se...@gmail.com>.
OK just wanted to clarify that this happens even when indexing a single
document, using curl or SolrJ:

curl -i -X POST http://localhost:7574/solr/gettingstarted/update/json/docs \
  -d '{"id" : "1", "unknown" : "foo"}'

So it's not only with bulk updates.

I will nevertheless reproduce this in a proper unit test.

Shai

On Tue, Nov 10, 2015 at 6:31 PM, Mark Miller <ma...@gmail.com> wrote:

> It's kind of tricky. And it doesn't help that we don't have real
> per-update errors when doing bulk or streaming.
>
> But you can start looking around DistributedUpdateProcessor,
> SolrCmdDistributor, StreamingSolrClients for update error propagation
> 'stuff'.
>
> - Mark
>

Re: Inconsistent error returned by Solr when indexing bad documents

Posted by Mark Miller <ma...@gmail.com>.
It's kind of tricky. And it doesn't help that we don't have real per-update
errors when doing bulk or streaming.

But you can start looking around DistributedUpdateProcessor,
SolrCmdDistributor, StreamingSolrClients for update error propagation
'stuff'.

- Mark

On Tue, Nov 10, 2015 at 10:39 AM Shai Erera <se...@gmail.com> wrote:

> Thanks Mark. I wrote a test for it, and I can port it to Solr's
> test-framework. Can you also give me a hint as to which area of the code
> I should look at to fix it?
>
> Shai
>
> --
- Mark
about.me/markrmiller

Re: Inconsistent error returned by Solr when indexing bad documents

Posted by Shai Erera <se...@gmail.com>.
Thanks Mark. I wrote a test for it, and I can port it to Solr's
test-framework. Can you also give me a hint as to which area of the code I
should look at to fix it?

Shai

On Tue, Nov 10, 2015 at 4:59 PM, Mark Miller <ma...@gmail.com> wrote:

> It looks like it's not properly propagating the root error. We probably
> need a test for that.
>
> - Mark
>

Re: Inconsistent error returned by Solr when indexing bad documents

Posted by Mark Miller <ma...@gmail.com>.
It looks like it's not properly propagating the root error. We probably
need a test for that.

- Mark

-- 
- Mark
about.me/markrmiller