Posted to users@solr.apache.org by dmitri maziuk <dm...@gmail.com> on 2022/10/28 20:12:56 UTC

8.11 docs "Sending JSON Update Commands" bug?

The example in TFM is a JSON object that contains two "add" keys and two
"delete" keys, and so can't be generated in any rational programming
language:

'''
curl -X POST -H 'Content-Type: application/json' \
  'http://localhost:8983/solr/my_collection/update' --data-binary '
{
   "add": {
     "doc": {
       "id": "DOC1",
       "my_field": 2.3,
       "my_multivalued_field": [ "aaa", "bbb" ]
     }
   },
   "add": {
     "commitWithin": 5000,
     "overwrite": false,
     "doc": {
       "f1": "v1",
       "f1": "v2"
     }
   },

   "commit": {},
   "optimize": { "waitSearcher":false },

   "delete": { "id":"ID" },
   "delete": { "query":"QUERY" }
}'
'''

Is that a documentation bug and the outer level is actually meant to be 
a list (of what?), or does it really expect the above string?

Anybody?

Dima


Re: 8.11 docs "Sending JSON Update Commands" bug?

Posted by dmitri maziuk <dm...@gmail.com>.
On 2022-10-31 4:55 PM, Thomas Corthals wrote:
> Solr does deviate from the 'does not assign any significance to the
> ordering of name/value pairs' part of that spec though. The order of "add"s
> and "delete"s within an update request does matter.

Yeah, that's the other problem with using a dict (python-speak) to send
a *sequence* of commands: keys of a dict are not intrinsically ordered,
so the implementation is free to run, say, a commit first, then all the
adds, and then the deletes.

But since that string cannot be demarshalled into a dict, that makes it 
all right I guess.

Dima
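
For what it's worth, the string does load into a Python dict, just lossily. Below is a minimal sketch using only the standard `json` module. The body is a trimmed, made-up variant of the docs example, and nothing here shows how Solr's own parser treats it; it only illustrates the difference between a map-style and a pair-style reading of the same text.

```python
import json

# Trimmed, illustrative body with a repeated "delete" name (not the full docs example).
body = '{"delete": {"id": "ID"}, "commit": {}, "delete": {"query": "QUERY"}}'

# Default parsing builds a dict, so a repeated name keeps only its last value.
as_dict = json.loads(body)

# object_pairs_hook receives every name/value pair in document order;
# returning the list unchanged keeps both the duplicates and their sequence.
as_pairs = json.loads(body, object_pairs_hook=lambda pairs: pairs)

print(as_dict)
print(as_pairs)
```

In the dict form the first "delete" is gone; in the pair form all three commands survive in the order they were written, with nested objects also returned as lists of pairs.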



Re: 8.11 docs "Sending JSON Update Commands" bug?

Posted by Thomas Corthals <th...@klascement.net>.
Solr does deviate from the 'does not assign any significance to the
ordering of name/value pairs' part of that spec though. The order of "add"s
and "delete"s within an update request does matter.

Thomas

On Mon, 31 Oct 2022 at 21:50, Walter Underwood <wunder@wunderwood.org>
wrote:

> Duplicate keys are somewhat surprising, but absolutely allowed and always
> have been.
>
> From the ECMA JSON spec:
>
> "The JSON syntax does not impose any restrictions on the strings used as
> names, does not require that name strings be unique, and does not assign
> any significance to the ordering of name/value pairs."
>
>
> https://www.ecma-international.org/publications-and-standards/standards/ecma-404/
>
> wunder
> Walter Underwood
> wunder@wunderwood.org
> http://observer.wunderwood.org/  (my blog)

Re: 8.11 docs "Sending JSON Update Commands" bug?

Posted by Walter Underwood <wu...@wunderwood.org>.
Duplicate keys are somewhat surprising, but absolutely allowed and always have been.

From the ECMA JSON spec:

"The JSON syntax does not impose any restrictions on the strings used as names, does not require that name strings be unique, and does not assign any significance to the ordering of name/value pairs."

https://www.ecma-international.org/publications-and-standards/standards/ecma-404/

wunder
Walter Underwood
wunder@wunderwood.org
http://observer.wunderwood.org/  (my blog)



Re: 8.11 docs "Sending JSON Update Commands" bug?

Posted by Adam Constabaris <aj...@ncsu.edu.INVALID>.
I don't know if there's a generally accepted name for it (but see
https://en.wikipedia.org/wiki/JSON_streaming) -- when you're using JSON to
pass around large numbers of objects, it's nice to be able to treat the
data as "just a bunch of records" that you can process one by one as they
arrive rather than having to read a very large array of objects into memory
which you then process.  These various kinds of "JSON serialization" see a
fair amount of use in the wild, including within Solr.

cheers,

AC
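
The "process one by one" idea can be sketched in Python with the standard library's `json.JSONDecoder.raw_decode`, which parses one value and reports where it stopped. The helper name `iter_json_values` and the sample records are hypothetical, and for brevity the sketch works on a string already in memory; a real streaming reader would pull chunks from a socket or file instead.

```python
import json

def iter_json_values(text):
    """Yield each top-level JSON value from a string of concatenated values."""
    decoder = json.JSONDecoder()
    pos = 0
    while pos < len(text):
        # raw_decode does not skip leading whitespace, so step over it first.
        if text[pos].isspace():
            pos += 1
            continue
        value, pos = decoder.raw_decode(text, pos)
        yield value

# Hypothetical records, one per line; any whitespace between values works.
stream = '{"id": "DOC1"}\n{"id": "DOC2"}\n{"id": "DOC3"}\n'
ids = [record["id"] for record in iter_json_values(stream)]
print(ids)
```

Each record is handed to the caller as soon as it is decoded, so nothing requires the whole sequence to be wrapped in one enclosing array.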



Re: 8.11 docs "Sending JSON Update Commands" bug?

Posted by dmitri maziuk <dm...@gmail.com>.
On 2022-10-28 4:26 PM, Mikhail Khludnev wrote:
> Well, Dmitry. Turns out keys in JSON _should be_ but not ought to be
> unique.

Right, just like a parachute _should_ but not ought to open on your way 
down.

> You can think about streaming writer or reader in any rational programming
> language.

Of course I can hand-write the string to be any kind of garbage I want, 
and then hand-write a parser to read it. But JSON stands for JavaScript 
Object Notation. If a string can't be demarshalled into a valid 
JavaScript Object, it goes *splat* on the ground.

Dima


Re: 8.11 docs "Sending JSON Update Commands" bug?

Posted by Mikhail Khludnev <mk...@apache.org>.
Well, Dmitry. Turns out keys in JSON _should be_ but not ought to be
unique.
You can think about streaming writer or reader in any rational programming
language.
If we talk about DOM style parsing we are not limited by a map, but also
can afford a bag or fabulous Solr's NamedList-ha!
I know what it looks like, but it is not significantly more shocking than
http GET with a body payload that's quite obvious in search.


-- 
Sincerely yours
Mikhail Khludnev
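
As a rough illustration of the "streaming writer" point, here is one way such a body could be produced in Python from an ordered list of (command, payload) pairs rather than from a dict. The helper `dumps_pairs` is hypothetical, not part of any Solr client, and the field values are placeholders modelled on the docs example.

```python
import json

def dumps_pairs(pairs):
    """Serialize (name, value) pairs as one JSON object, keeping order and duplicates."""
    members = (f"{json.dumps(name)}: {json.dumps(value)}" for name, value in pairs)
    return "{" + ", ".join(members) + "}"

# Command sequence modelled on the docs example; values are placeholders.
commands = [
    ("add", {"doc": {"id": "DOC1", "my_field": 2.3}}),
    ("add", {"commitWithin": 5000, "overwrite": False, "doc": {"f1": "v1"}}),
    ("commit", {}),
    ("delete", {"id": "ID"}),
    ("delete", {"query": "QUERY"}),
]

body = dumps_pairs(commands)
print(body)
```

Only the top level is written pair by pair. The nested payloads are ordinary dicts, so a repeated field such as "f1" would have to be written in the array form the docs example uses for "my_multivalued_field".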