Posted to users@solr.apache.org by dmitri maziuk <dm...@gmail.com> on 2022/10/27 22:44:21 UTC

HTTP errors POSTing to 8.11.2

Hi all,

has anyone gone through the exercise of replacing Data Import Handler 
with scripts that POST JSON and if so, are your scripts still working OK 
with 8.11.2?

I got a few that work fine with 6.5 and 8.7 but are throwing 503s and 
occasional 400s all over the place with 8.11.2.

Solr isn't logging anything useful (at least not at INFO level) and I 
can't quite figure out what's up. I tried tweaking the scripts in all 
kinds of ways but that isn't helping. The same VM previously ran 6.5 
where these problems didn't exist. So I am inclined to blame 8.11.2 at 
this point.

Are there some new security settings I need to tweak to POST or 
something? Any other suggestions?

TIA
Dima

SOLVED Re: HTTP errors POSTing to 8.11.2

Posted by dmitri maziuk <dm...@gmail.com>.
Maybe solved: looking closer at the instances that worked, I realized 
that their NSSM service ran a bat file that called solr.cmd with "-f". I 
started solr.cmd directly from NSSM and forgot the "-f".

Changing the service to run a batch file appears to have fixed it. 
Whether it was the "-f" or CMD or both, it's not broken now and that's 
good enough for me.

Dima


Re: HTTP errors POSTing to 8.11.2

Posted by Thomas Corthals <th...@klascement.net>.
On Fri 28 Oct 2022 at 17:37, dmitri maziuk <dm...@gmail.com> wrote:

> It's a clean stand-alone install. It's not going through any proxies,
> the scripts are erroring out when run on the same server too, and it
> being python, the complete http conversation is a bit hard to get to.


Hi Dima

Whenever I need the full HTTP conversation, I use Wireshark for that.
Capture on the interface, filter on port number to easily find the request,
right-click on the request to Follow HTTP Stream.

Thomas
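
For scripts written in Python, a lighter alternative to a packet capture may be to let the HTTP library print the conversation itself. The sketch below assumes the scripts use the standard library's urllib; post_json and build_debug_opener are hypothetical names, not anything taken from the scripts discussed in this thread.

```python
import json
import urllib.error
import urllib.request


def build_debug_opener():
    # debuglevel=1 makes http.client print the request it sends and the
    # response status line and headers it receives to stdout.
    return urllib.request.build_opener(urllib.request.HTTPHandler(debuglevel=1))


def post_json(url, payload, opener=None):
    """POST payload as JSON and return (status, body), also for 4xx/5xx."""
    opener = opener or urllib.request.build_opener()
    req = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    try:
        with opener.open(req, timeout=30) as resp:
            return resp.status, resp.read().decode("utf-8", "replace")
    except urllib.error.HTTPError as err:
        # HTTPError still carries the body the server (or a proxy) sent back,
        # which is usually where the useful error text is.
        return err.code, err.read().decode("utf-8", "replace")


# Example call (URL as used elsewhere in this thread, not executed here):
# status, body = post_json(
#     "http://localhost:8983/solr/my_collection/update",
#     [{"id": "DOC1"}],
#     opener=build_debug_opener(),
# )
```

Returning the body of an error response instead of raising makes it easier to share the entire 503 or 400 text when asking for help.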

Re: HTTP errors POSTing to 8.11.2

Posted by dmitri maziuk <dm...@gmail.com>.
On 2022-10-28 10:02 AM, Shawn Heisey wrote:

> If you have already checked solr.log and don't see anything, that's very 
> odd.  Is the server in cloud mode and part of a cluster?  If it is, then 
> the error might be logged by a different server.  Can you share the 
> entire HTTP error responses?

It's a clean stand-alone install. It's not going through any proxies, 
the scripts are erroring out when run on the same server too, and it 
being python, the complete http conversation is a bit hard to get to.

There are, however, 2 more nodes that, together with this one, were part 
of a 6.5 cluster... maybe I should check if 6.5 instances are still up 
on them and kill them if so. I don't see why this one would be spilling 
over to them but it's an easy thing to check.

Thanks,
Dima


Re: 8.11 docs "Sending JSON Update Commands" bug?

Posted by dmitri maziuk <dm...@gmail.com>.
On 2022-10-31 4:55 PM, Thomas Corthals wrote:
> Solr does deviate from the 'does not assign any significance to the
> ordering of name/value pairs' part of that spec though. The order of "add"s
> and "delete"s within an update request does matter.

Yeah, that's the other problem with using a dict (python-speak) to send 
a *sequence* of commands: keys of a dict are not intrinsically ordered 
so the implementation is free to run, say, a commit first, then all the 
adds, and then the deletes.

But since that string cannot be demarshalled into a dict, that makes it 
all right I guess.

Dima
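
A request body with repeated "add" and "delete" keys can still be produced from Python if the commands are kept as an ordered list of pairs rather than a dict (dicts keep insertion order since Python 3.7, but they cannot hold the same key twice). This is only a sketch: build_update_body is a hypothetical helper, and the payloads are simplified stand-ins for the ones in the documentation example.

```python
import json


def build_update_body(commands):
    """Serialize an ordered list of (command, payload) pairs as one JSON object.

    Repeated command names are written as repeated keys, in list order.
    """
    members = (
        json.dumps(name) + ": " + json.dumps(payload)
        for name, payload in commands
    )
    return "{" + ", ".join(members) + "}"


# Hypothetical command sequence; order in the list is the order in the body.
commands = [
    ("add", {"doc": {"id": "DOC1", "my_field": 2.3}}),
    ("add", {"commitWithin": 5000, "doc": {"id": "DOC2"}}),
    ("commit", {}),
    ("delete", {"id": "ID"}),
    ("delete", {"query": "QUERY"}),
]
body = build_update_body(commands)
```

Each name and payload still goes through json.dumps, so escaping is handled by the library; only the outer braces and commas are written by hand.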



Re: 8.11 docs "Sending JSON Update Commands" bug?

Posted by Thomas Corthals <th...@klascement.net>.
Solr does deviate from the 'does not assign any significance to the
ordering of name/value pairs' part of that spec though. The order of "add"s
and "delete"s within an update request does matter.

Thomas

On Mon 31 Oct 2022 at 21:50, Walter Underwood <wunder@wunderwood.org> wrote:

> Duplicate keys are somewhat surprising, but absolutely allowed and always
> have been.
>
> From the ECMA JSON spec:
>
> "The JSON syntax does not impose any restrictions on the strings used as
> names, does not require that name strings be unique, and does not assign
> any significance to the ordering of name/value pairs."
>
>
> https://www.ecma-international.org/publications-and-standards/standards/ecma-404/
>
> wunder
> Walter Underwood
> wunder@wunderwood.org
> http://observer.wunderwood.org/  (my blog)
>

Re: 8.11 docs "Sending JSON Update Commands" bug?

Posted by Walter Underwood <wu...@wunderwood.org>.
Duplicate keys are somewhat surprising, but absolutely allowed and always have been.

From the ECMA JSON spec:

"The JSON syntax does not impose any restrictions on the strings used as names, does not require that name strings be unique, and does not assign any significance to the ordering of name/value pairs."

https://www.ecma-international.org/publications-and-standards/standards/ecma-404/

wunder
Walter Underwood
wunder@wunderwood.org
http://observer.wunderwood.org/  (my blog)

> On Oct 31, 2022, at 1:42 PM, Adam Constabaris <aj...@ncsu.edu.INVALID> wrote:
> 
> I don't know if there's a generally accepted name for it (but see
> https://en.wikipedia.org/wiki/JSON_streaming) -- when you're using JSON to
> pass around large numbers of objects, it's nice to be able to treat the
> data as "just a bunch of records" that you can process one by one as they
> arrive rather than having to read a very large array of objects into memory
> which you then process.  These various kinds of "JSON serialization" see a
> fair amount of use in the wild, including within Solr.
> 
> cheers,
> 
> AC
> 
> 


Re: 8.11 docs "Sending JSON Update Commands" bug?

Posted by Adam Constabaris <aj...@ncsu.edu.INVALID>.
I don't know if there's a generally accepted name for it (but see
https://en.wikipedia.org/wiki/JSON_streaming) -- when you're using JSON to
pass around large numbers of objects, it's nice to be able to treat the
data as "just a bunch of records" that you can process one by one as they
arrive rather than having to read a very large array of objects into memory
which you then process.  These various kinds of "JSON serialization" see a
fair amount of use in the wild, including within Solr.

cheers,

AC


On Fri, Oct 28, 2022 at 6:01 PM dmitri maziuk <dm...@gmail.com>
wrote:

> On 2022-10-28 4:26 PM, Mikhail Khludnev wrote:
> > Well, Dmitry. Turns out keys in JSON _should be_ but not ought to be
> > unique.
>
> Right, just like a parachute _should_ but not ought to open on your way
> down.
>
> > You can think about streaming writer or reader in any rational
> > programming language.
>
> Of course I can hand-write the string to be any kind of garbage I want,
> and then hand-write a parser to read it. But JSON stands for JavaScript
> Object Notation. If a string can't be demarshalled into a valid
> JavaScript Object, it goes *splat* on the ground.
>
> Dima
>
>

Re: 8.11 docs "Sending JSON Update Commands" bug?

Posted by dmitri maziuk <dm...@gmail.com>.
On 2022-10-28 4:26 PM, Mikhail Khludnev wrote:
> Well, Dmitry. Turns out keys in JSON _should be_ but not ought to be
> unique.

Right, just like a parachute _should_ but not ought to open on your way 
down.

> You can think about streaming writer or reader in any rational programming
> language.

Of course I can hand-write the string to be any kind of garbage I want, 
and then hand-write a parser to read it. But JSON stands for JavaScript 
Object Notation. If a string can't be demarshalled into a valid 
JavaScript Object, it goes *splat* on the ground.

Dima


Re: 8.11 docs "Sending JSON Update Commands" bug?

Posted by Mikhail Khludnev <mk...@apache.org>.
Well, Dmitry. Turns out keys in JSON _should be_ but not ought to be
unique.
You can think about streaming writer or reader in any rational programming
language.
If we talk about DOM-style parsing, we are not limited to a map; we can
also afford a bag, or Solr's fabulous NamedList, ha!
I know what it looks like, but it is not significantly more shocking than
http GET with a body payload that's quite obvious in search.
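
The "bag instead of a map" idea can be illustrated with Python's standard json module, which by default keeps only the last value for a repeated name but can hand every name/value pair to a hook in document order. This is a sketch of the parsing idea only, not of how Solr itself reads update requests; the sample string is made up for illustration.

```python
import json

# Made-up sample with a repeated "add" key.
text = '{"add": {"doc": {"id": "A"}}, "add": {"doc": {"id": "B"}}, "commit": {}}'

# Default behaviour: a dict, so the first "add" is silently overwritten.
as_dict = json.loads(text)

# object_pairs_hook receives each object's pairs as a list of
# (name, value) tuples in document order; returning that list keeps
# duplicates and ordering (nested objects become lists of pairs too).
as_pairs = json.loads(text, object_pairs_hook=list)

names = [name for name, _ in as_pairs]
```

Here as_dict ends up with one "add" entry, while names lists "add" twice followed by "commit".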


On Fri, Oct 28, 2022 at 11:13 PM dmitri maziuk <dm...@gmail.com>
wrote:

> The example in TFM is a JSON object that contains two "add" keys and two
> "delete" keys and so can't be generated in any rational programming
> language:
>
> '''
> curl -X POST -H 'Content-Type: application/json'
> 'http://localhost:8983/solr/my_collection/update' --data-binary '
> {
>    "add": {
>      "doc": {
>        "id": "DOC1",
>        "my_field": 2.3,
>        "my_multivalued_field": [ "aaa", "bbb" ]
>      }
>    },
>    "add": {
>      "commitWithin": 5000,
>      "overwrite": false,
>      "doc": {
>        "f1": "v1",
>        "f1": "v2"
>      }
>    },
>
>    "commit": {},
>    "optimize": { "waitSearcher":false },
>
>    "delete": { "id":"ID" },
>    "delete": { "query":"QUERY" }
> }'
> '''
>
> Is that a documentation bug and the outer level is actually meant to be
> a list (of what?), or does it really expect the above string?
>
> Anybody?
>
> Dima
>
>

-- 
Sincerely yours
Mikhail Khludnev

8.11 docs "Sending JSON Update Commands" bug?

Posted by dmitri maziuk <dm...@gmail.com>.
The example in TFM is a JSON object that contains two "add" keys and two 
"delete" keys and so can't be generated in any rational programming 
language:

'''
curl -X POST -H 'Content-Type: application/json' 
'http://localhost:8983/solr/my_collection/update' --data-binary '
{
   "add": {
     "doc": {
       "id": "DOC1",
       "my_field": 2.3,
       "my_multivalued_field": [ "aaa", "bbb" ]
     }
   },
   "add": {
     "commitWithin": 5000,
     "overwrite": false,
     "doc": {
       "f1": "v1",
       "f1": "v2"
     }
   },

   "commit": {},
   "optimize": { "waitSearcher":false },

   "delete": { "id":"ID" },
   "delete": { "query":"QUERY" }
}'
'''

Is that a documentation bug and the outer level is actually meant to be 
a list (of what?), or does it really expect the above string?

Anybody?

Dima


Re: HTTP errors POSTing to 8.11.2

Posted by Shawn Heisey <ap...@elyograg.org>.
On 10/27/22 16:44, dmitri maziuk wrote:
> has anyone gone through the exercise of replacing Data Import Handler 
> with scripts that POST JSON and if so, are your scripts still working 
> OK with 8.11.2?
>
> I got a few that work fine with 6.5 and 8.7 but are throwing 503s and 
> occasional 400s all over the place with 8.11.2.

I converted an entire build system from using DIH requests in Perl to 
SolrJ in Java.  Wasn't using JSON, but the idea is similar. For full 
rebuilds, it still used DIH, but all those calls were made with SolrJ.  
That system was NOT in cloud mode.

Normally a 503 error will not be generated by Solr.  The only places I 
found in the code that do this are when a health check fails or "Server 
is shutting down or failed to initialize".

If neither of those situations is present, then any 503 error is 
probably generated by a reverse proxy, not Solr.  Examples of this are 
haproxy, nginx, and apache httpd.

Do you have the whole error text, including any stacktraces?

> Solr isn't logging anything useful (at least not at INFO level) and I 
> can't quite figure out what's up. I tried tweaking the scripts in all 
> kinds of ways but that isn't helping. The same VM previously ran 6.5 
> where these problems didn't exist. So I am inclined to blame 8.11.2 at 
> this point.

If you have already checked solr.log and don't see anything, that's very 
odd.  Is the server in cloud mode and part of a cluster?  If it is, then 
the error might be logged by a different server.  Can you share the 
entire HTTP error responses?

Thanks,
Shawn


Re: HTTP errors POSTing to 8.11.2

Posted by Andy Lester <an...@petdance.com>.

> On Oct 27, 2022, at 5:44 PM, dmitri maziuk <dm...@gmail.com> wrote:
> 
> has anyone gone through the exercise of replacing Data Import Handler with scripts that POST JSON and if so, are your scripts still working OK with 8.11.2?

That's exactly what I did a couple of years ago, and they work just fine on our install of 8.11.1.