Posted to solr-user@lucene.apache.org by Remy Loubradou <re...@gmail.com> on 2011/07/20 18:16:55 UTC

Wiki Error JSON syntax

Hi,
I was writing a Solr client API for Node and I found an error on this page:
http://wiki.apache.org/solr/UpdateJSON. In the section "Update Commands" the
JSON is not valid because there are duplicate keys: "add" and "delete" each
appear twice. I tried with an array and it doesn't work either; I got error
400. I think that's because the syntax is bad.

I don't really know if this is the right place to talk about that, but it's
the only place I found. Sorry if it's not.

Thanks,

And I love Solr :)

Re: Wiki Error JSON syntax

Posted by Remy Loubradou <re...@gmail.com>.
Hi,

2011/7/25 Gabriel Farrell <gs...@gmail.com>

> On Mon, Jul 25, 2011 at 12:24 PM, Stefan Matheis
> <ma...@googlemail.com> wrote:
> > Hi Remy,
> >
> > so you may open an Issue for this on the github Project? i mean .. just
> > creating another client, because i have one problem, does not sound like
> a
> > good plan?
>
> Agreed, and thanks for calling my attention to this thread, Stefan.
>

Yes, I agree too. I'm not used to pull requests and submitting issues; it's
new for me :). I will open an issue.

>
> > Regards
> > Stefan
> >
> > On 25.07.2011 10:56, Remy Loubradou wrote:
> >>
> >> Hey Stephan,
> >>
> >> Thanks, but I already used this solr client and I got an error when I
> add
> >> too much documents "FATAL ERROR: JS Allocation failed - process out of
> >> memory".
> >> I didn't find the source of the problem in the solr client. So I decided
> >> to
> >> write my own without this error hopefully and also I'm using JSON
> >> documents
> >> and not XML documents. I read a post saying that I can get better
> >> performance using JSON documents.
> >>
> >> I will release this client as an npm module.
>
> How many documents are you attempting to add at once when you get that
> error? Would it possible to chunk them into smaller groups?
>

I add one document and commit afterwards. But it's a very big XML document
(~130MB), and the error only happens with big XML files.
I got this error: FATAL ERROR: JS Allocation failed - process out of memory.


>
> I'm happy to work with you on enhancing node-solr to meet your needs.
> The only reason updates are via XML rather than JSON is that 3.1 was
> new or not yet released (don't quite remember which) when I first
> wrote node-solr. Even now I imagine many people may still be using a
> version of Solr that doesn't handle JSON updates. Maybe a flag
> parameter could be added to the Client object to switch from XML to
> JSON?
>

Yes, this could be a good solution. Then we can try to merge your solr-client
with my client.
Before releasing my client I want to have full test coverage with vows
(have some fun :) )

>
> The node-solr client has fairly complete test coverage, a history of
> commits from the Node community, and  versions aligning with several
> versions of Node. I would appreciate your contributions, either via
> issues or pull requests.
>

Cool. I will be pleased to do that!

>
> >> Regards,
> >> Remy
> >>
> >> 2011/7/25 Stefan Matheis<ma...@googlemail.com>
> >>
> >>> Remy,
> >>>
> >>> didn't use it myself .. but you know about
> >>> https://github.com/gsf/node-solr ?
> >>>
> >>> Regards
> >>> Stefan
> >>>
> >>> On 20.07.2011 20:05, Remy Loubradou wrote:
> >>>
> >>>  I think I can trust you but this is weird.
> >>>>
> >>>> Funny things if you try to validate on http://jsonlint.com/ this
> JSON,
> >>>> duplicates keys are automatically removed. But the thing is, how can
> you
> >>>> possibly generate this json with Javascript Object?
> >>>>
> >>>> It will be really nice to combine both ways that you show on the page.
> >>>> Something like:
> >>>>
> >>>> {
> >>>>     "add": [
> >>>>         {
> >>>>             "doc": {
> >>>>                 "id": "DOC1",
> >>>>                 "my_boosted_field": {
> >>>>                     "boost": 2.3,
> >>>>                     "value": "test"
> >>>>                 },
> >>>>                 "my_multivalued_field": [
> >>>>                     "aaa",
> >>>>                     "bbb"
> >>>>                 ]
> >>>>             }
> >>>>         },
> >>>>         {
> >>>>             "commitWithin": 5000,
> >>>>             "overwrite": false,
> >>>>             "boost": 3.45,
> >>>>             "doc": {
> >>>>                 "f1": "v2"
> >>>>             }
> >>>>         }
> >>>>     ],
> >>>>     "commit": {},
> >>>>     "optimize": {
> >>>>         "waitFlush": false,
> >>>>         "waitSearcher": false
> >>>>     },
> >>>>     "delete": [
> >>>>         {
> >>>>             "id": "ID"
> >>>>         },
> >>>>         {
> >>>>             "query": "QUERY"
> >>>>         }
> >>>>     ]
> >>>> }
> >>>>
> >>>> Thanks you for you previous response Yonik.
> >>>>
> >>>> 2011/7/20 Yonik Seeley<yo...@lucidimagination.com>
> >>>>
> >>>>> On Wed, Jul 20, 2011 at 12:16 PM, Remy Loubradou
> >>>>> <re...@gmail.com> wrote:
> >>>>>> Hi,
> >>>>>> I was writing a Solr Client API for Node and I found an error on this page
> >>>>>> http://wiki.apache.org/solr/UpdateJSON ,on the section "Update Commands" the
> >>>>>> JSON is not valid because there are duplicate keys and two times with "add"
> >>>>>> and "delete".
> >>>>>
> >>>>> It's a common misconception that it's invalid JSON.  Duplicate keys
> >>>>> are in fact legal.
>
> I can't resist addressing this side conversation.
>
> While I understand the desire for a straightforward mapping between
> the XML and JSON update formats, I think the use of duplicate keys is
> a bad idea. As noted in the spec
> (http://www.ietf.org/rfc/rfc4627.txt), "The names within an object
> SHOULD be unique." I'm not sure the reasons here justify ignoring that
> recommendation.
>
> The fact that you need to keep reminding people that duplicate names
> are legal is a sign that it's more trouble than it's worth. Also, most
> JSON parsers just punt on duplicate names (see the third paragraph in
> "A word about design" at
>
> http://planet.plt-scheme.org/package-source/dherman/json.plt/3/0/planet-docs/json/index.html
> for one take on the situation). I really don't want to write a new
> JavaScript JSON parser just for node-solr.
>

Totally agree,

Regards,
Remy

Re: Wiki Error JSON syntax

Posted by Gabriel Farrell <gs...@gmail.com>.
On Mon, Jul 25, 2011 at 12:24 PM, Stefan Matheis
<ma...@googlemail.com> wrote:
> Hi Remy,
>
> so you may open an Issue for this on the github Project? i mean .. just
> creating another client, because i have one problem, does not sound like a
> good plan?

Agreed, and thanks for calling my attention to this thread, Stefan.

> Regards
> Stefan
>
> On 25.07.2011 10:56, Remy Loubradou wrote:
>>
>> Hey Stephan,
>>
>> Thanks, but I already used this solr client and I got an error when I add
>> too much documents "FATAL ERROR: JS Allocation failed - process out of
>> memory".
>> I didn't find the source of the problem in the solr client. So I decided
>> to
>> write my own without this error hopefully and also I'm using JSON
>> documents
>> and not XML documents. I read a post saying that I can get better
>> performance using JSON documents.
>>
>> I will release this client as an npm module.

How many documents are you attempting to add at once when you get that
error? Would it be possible to chunk them into smaller groups?
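
As a rough illustration of the chunking idea, here is a minimal sketch that
splits an array of documents into smaller batches. The helper name `chunk`
and the sample documents are hypothetical and are not part of node-solr; each
batch would then be sent as its own update request.

```javascript
// Hypothetical helper (not part of node-solr): split an array of documents
// into batches of at most `size` items each.
function chunk(docs, size) {
  if (!Number.isInteger(size) || size < 1) {
    throw new RangeError("size must be a positive integer");
  }
  const batches = [];
  for (let i = 0; i < docs.length; i += size) {
    batches.push(docs.slice(i, i + size));
  }
  return batches;
}

// Made-up sample data for illustration only.
const docs = [{ id: "1" }, { id: "2" }, { id: "3" }, { id: "4" }, { id: "5" }];
const batches = chunk(docs, 2);

console.log(batches.map((batch) => batch.length)); // [ 2, 2, 1 ]
```

Note that chunking only helps when the memory pressure comes from the number
of documents; it would not by itself help with one very large document.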

I'm happy to work with you on enhancing node-solr to meet your needs.
The only reason updates are via XML rather than JSON is that 3.1 was
new or not yet released (don't quite remember which) when I first
wrote node-solr. Even now I imagine many people may still be using a
version of Solr that doesn't handle JSON updates. Maybe a flag
parameter could be added to the Client object to switch from XML to
JSON?
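
A minimal sketch of what such a flag might look like, assuming a `format`
option that defaults to XML. The function names `buildAddRequest` and
`escapeXml` and the option name are hypothetical and are not node-solr's
actual API; the sketch only builds the request body and content type for a
single document with single-valued fields.

```javascript
// Hypothetical sketch, not node-solr's actual API.
// Escape the characters that are special in XML text and attribute values.
function escapeXml(value) {
  return String(value)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Build the body of an "add" update for one document.
// `options.format` is an assumed flag: "xml" (default) or "json".
// Only single-valued fields are handled in this sketch.
function buildAddRequest(doc, options = {}) {
  const format = options.format || "xml";
  if (format === "json") {
    return {
      contentType: "application/json",
      body: JSON.stringify({ add: { doc } }),
    };
  }
  if (format !== "xml") {
    throw new Error("unsupported format: " + format);
  }
  const fields = Object.entries(doc)
    .map(([name, value]) => `<field name="${escapeXml(name)}">${escapeXml(value)}</field>`)
    .join("");
  return {
    contentType: "text/xml",
    body: `<add><doc>${fields}</doc></add>`,
  };
}

console.log(buildAddRequest({ id: "DOC1" }).body);
console.log(buildAddRequest({ id: "DOC1" }, { format: "json" }).body);
```

Keeping XML as the default would leave existing users on older Solr versions
unaffected, while JSON becomes opt-in.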

The node-solr client has fairly complete test coverage, a history of
commits from the Node community, and  versions aligning with several
versions of Node. I would appreciate your contributions, either via
issues or pull requests.

>> Regards,
>> Remy
>>
>> 2011/7/25 Stefan Matheis<ma...@googlemail.com>
>>
>>> Remy,
>>>
>>> didn't use it myself .. but you know about
>>> https://github.com/gsf/node-solr ?
>>>
>>> Regards
>>> Stefan
>>>
>>> On 20.07.2011 20:05, Remy Loubradou wrote:
>>>
>>>  I think I can trust you but this is weird.
>>>>
>>>> Funny things if you try to validate on http://jsonlint.com/ this JSON,
>>>> duplicates keys are automatically removed. But the thing is, how can you
>>>> possibly generate this json with Javascript Object?
>>>>
>>>> It will be really nice to combine both ways that you show on the page.
>>>> Something like:
>>>>
>>>> {
>>>>     "add": [
>>>>         {
>>>>             "doc": {
>>>>                 "id": "DOC1",
>>>>                 "my_boosted_field": {
>>>>                     "boost": 2.3,
>>>>                     "value": "test"
>>>>                 },
>>>>                 "my_multivalued_field": [
>>>>                     "aaa",
>>>>                     "bbb"
>>>>                 ]
>>>>             }
>>>>         },
>>>>         {
>>>>             "commitWithin": 5000,
>>>>             "overwrite": false,
>>>>             "boost": 3.45,
>>>>             "doc": {
>>>>                 "f1": "v2"
>>>>             }
>>>>         }
>>>>     ],
>>>>     "commit": {},
>>>>     "optimize": {
>>>>         "waitFlush": false,
>>>>         "waitSearcher": false
>>>>     },
>>>>     "delete": [
>>>>         {
>>>>             "id": "ID"
>>>>         },
>>>>         {
>>>>             "query": "QUERY"
>>>>         }
>>>>     ]
>>>> }
>>>>
>>>> Thanks you for you previous response Yonik.
>>>>
>>>> 2011/7/20 Yonik Seeley<yo...@lucidimagination.com>
>>>>
>>>>> On Wed, Jul 20, 2011 at 12:16 PM, Remy Loubradou
>>>>> <re...@gmail.com> wrote:
>>>>>> Hi,
>>>>>> I was writing a Solr Client API for Node and I found an error on this page
>>>>>> http://wiki.apache.org/solr/UpdateJSON ,on the section "Update Commands" the
>>>>>> JSON is not valid because there are duplicate keys and two times with "add"
>>>>>> and "delete".
>>>>>
>>>>> It's a common misconception that it's invalid JSON.  Duplicate keys
>>>>> are in fact legal.

I can't resist addressing this side conversation.

While I understand the desire for a straightforward mapping between
the XML and JSON update formats, I think the use of duplicate keys is
a bad idea. As noted in the spec
(http://www.ietf.org/rfc/rfc4627.txt), "The names within an object
SHOULD be unique." I'm not sure the reasons here justify ignoring that
recommendation.

The fact that you need to keep reminding people that duplicate names
are legal is a sign that it's more trouble than it's worth. Also, most
JSON parsers just punt on duplicate names (see the third paragraph in
"A word about design" at
http://planet.plt-scheme.org/package-source/dherman/json.plt/3/0/planet-docs/json/index.html
for one take on the situation). I really don't want to write a new
JavaScript JSON parser just for node-solr.
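
To illustrate the point about parsers, here is a minimal sketch of what the
built-in `JSON.parse` in Node does with a repeated name. The document IDs are
made up for illustration.

```javascript
// A JSON text with a repeated "add" name, in the style discussed above.
const text = '{"add": {"doc": {"id": "DOC1"}}, "add": {"doc": {"id": "DOC2"}}}';

// JSON.parse accepts the text, but a JavaScript object holds only one value
// per property name, so the later "add" replaces the earlier one.
const parsed = JSON.parse(text);

console.log(Object.keys(parsed).length); // 1
console.log(parsed.add.doc.id);          // DOC2
```

The first "add" command is silently lost, which is why a client built on
plain objects cannot round-trip the duplicate-key form.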

Re: Wiki Error JSON syntax

Posted by Stefan Matheis <ma...@googlemail.com>.
Hi Remy,

So maybe you could open an issue for this on the GitHub project? I mean,
just creating another client because of one problem does not sound like
a good plan.

Regards
Stefan

On 25.07.2011 10:56, Remy Loubradou wrote:
> Hey Stephan,
>
> Thanks, but I already used this solr client and I got an error when I add
> too much documents "FATAL ERROR: JS Allocation failed - process out of
> memory".
> I didn't find the source of the problem in the solr client. So I decided to
> write my own without this error hopefully and also I'm using JSON documents
> and not XML documents. I read a post saying that I can get better
> performance using JSON documents.
>
> I will release this client as an npm module.
>
> Regards,
> Remy
>
> 2011/7/25 Stefan Matheis<ma...@googlemail.com>
>
>> Remy,
>>
>> didn't use it myself .. but you know about
>> https://github.com/gsf/node-solr ?
>>
>> Regards
>> Stefan
>>
>> On 20.07.2011 20:05, Remy Loubradou wrote:
>>
>>   I think I can trust you but this is weird.
>>> Funny things if you try to validate on http://jsonlint.com/ this JSON,
>>> duplicates keys are automatically removed. But the thing is, how can you
>>> possibly generate this json with Javascript Object?
>>>
>>> It will be really nice to combine both ways that you show on the page.
>>> Something like:
>>>
>>> {
>>>      "add": [
>>>          {
>>>              "doc": {
>>>                  "id": "DOC1",
>>>                  "my_boosted_field": {
>>>                      "boost": 2.3,
>>>                      "value": "test"
>>>                  },
>>>                  "my_multivalued_field": [
>>>                      "aaa",
>>>                      "bbb"
>>>                  ]
>>>              }
>>>          },
>>>          {
>>>              "commitWithin": 5000,
>>>              "overwrite": false,
>>>              "boost": 3.45,
>>>              "doc": {
>>>                  "f1": "v2"
>>>              }
>>>          }
>>>      ],
>>>      "commit": {},
>>>      "optimize": {
>>>          "waitFlush": false,
>>>          "waitSearcher": false
>>>      },
>>>      "delete": [
>>>          {
>>>              "id": "ID"
>>>          },
>>>          {
>>>              "query": "QUERY"
>>>          }
>>>      ]
>>> }
>>>
>>> Thanks you for you previous response Yonik.
>>>
>>> 2011/7/20 Yonik Seeley<yo...@lucidimagination.com>
>>>
>>>> On Wed, Jul 20, 2011 at 12:16 PM, Remy Loubradou
>>>> <re...@gmail.com> wrote:
>>>>> Hi,
>>>>> I was writing a Solr Client API for Node and I found an error on this page
>>>>> http://wiki.apache.org/solr/UpdateJSON ,on the section "Update Commands" the
>>>>> JSON is not valid because there are duplicate keys and two times with "add"
>>>>> and "delete".
>>>>
>>>> It's a common misconception that it's invalid JSON.  Duplicate keys
>>>> are in fact legal.
>>>>
>>>> -Yonik
>>>> http://www.lucidimagination.com
>>>>
>>>>> I tried with an array and it doesn't work as well, I got error
>>>>> 400, I think that's because the syntax is bad.
>>>>>
>>>>> I don't really know if I am at the good place to talk about that but ...
>>>>> that the only place I found. Sorry if it's not.
>>>>>
>>>>> Thanks,
>>>>>
>>>>> And I love Solr :)
>>>
>

Re: Wiki Error JSON syntax

Posted by Remy Loubradou <re...@gmail.com>.
Hey Stefan,

Thanks, but I already used this Solr client and I got an error when I add
too many documents: "FATAL ERROR: JS Allocation failed - process out of
memory".
I couldn't find the source of the problem in the Solr client, so I decided to
write my own, hopefully without this error. Also, I'm using JSON documents
and not XML documents; I read a post saying that I can get better
performance using JSON documents.

I will release this client as an npm module.

Regards,
Remy

2011/7/25 Stefan Matheis <ma...@googlemail.com>

> Remy,
>
> didn't use it myself .. but you know about
> https://github.com/gsf/node-solr ?
>
> Regards
> Stefan
>
> On 20.07.2011 20:05, Remy Loubradou wrote:
>
>  I think I can trust you but this is weird.
>> Funny things if you try to validate on http://jsonlint.com/ this JSON,
>> duplicates keys are automatically removed. But the thing is, how can you
>> possibly generate this json with Javascript Object?
>>
>> It will be really nice to combine both ways that you show on the page.
>> Something like:
>>
>> {
>>     "add": [
>>         {
>>             "doc": {
>>                 "id": "DOC1",
>>                 "my_boosted_field": {
>>                     "boost": 2.3,
>>                     "value": "test"
>>                 },
>>                 "my_multivalued_field": [
>>                     "aaa",
>>                     "bbb"
>>                 ]
>>             }
>>         },
>>         {
>>             "commitWithin": 5000,
>>             "overwrite": false,
>>             "boost": 3.45,
>>             "doc": {
>>                 "f1": "v2"
>>             }
>>         }
>>     ],
>>     "commit": {},
>>     "optimize": {
>>         "waitFlush": false,
>>         "waitSearcher": false
>>     },
>>     "delete": [
>>         {
>>             "id": "ID"
>>         },
>>         {
>>             "query": "QUERY"
>>         }
>>     ]
>> }
>>
>> Thanks you for you previous response Yonik.
>>
>> 2011/7/20 Yonik Seeley<yo...@lucidimagination.com>
>>
>>> On Wed, Jul 20, 2011 at 12:16 PM, Remy Loubradou
>>> <re...@gmail.com> wrote:
>>>> Hi,
>>>> I was writing a Solr Client API for Node and I found an error on this page
>>>> http://wiki.apache.org/solr/UpdateJSON ,on the section "Update Commands" the
>>>> JSON is not valid because there are duplicate keys and two times with "add"
>>>> and "delete".
>>>
>>> It's a common misconception that it's invalid JSON.  Duplicate keys
>>> are in fact legal.
>>>
>>> -Yonik
>>> http://www.lucidimagination.com
>>>
>>>> I tried with an array and it doesn't work as well, I got error
>>>> 400, I think that's because the syntax is bad.
>>>>
>>>> I don't really know if I am at the good place to talk about that but ...
>>>> that the only place I found. Sorry if it's not.
>>>>
>>>> Thanks,
>>>>
>>>> And I love Solr :)
>>

Re: Wiki Error JSON syntax

Posted by Stefan Matheis <ma...@googlemail.com>.
Remy,

didn't use it myself .. but you know about 
https://github.com/gsf/node-solr ?

Regards
Stefan

On 20.07.2011 20:05, Remy Loubradou wrote:
> I think I can trust you but this is weird.
> Funny things if you try to validate on http://jsonlint.com/ this JSON,
> duplicates keys are automatically removed. But the thing is, how can you
> possibly generate this json with Javascript Object?
>
> It will be really nice to combine both ways that you show on the page.
> Something like:
>
> {
>      "add": [
>          {
>              "doc": {
>                  "id": "DOC1",
>                  "my_boosted_field": {
>                      "boost": 2.3,
>                      "value": "test"
>                  },
>                  "my_multivalued_field": [
>                      "aaa",
>                      "bbb"
>                  ]
>              }
>          },
>          {
>              "commitWithin": 5000,
>              "overwrite": false,
>              "boost": 3.45,
>              "doc": {
>                  "f1": "v2"
>              }
>          }
>      ],
>      "commit": {},
>      "optimize": {
>          "waitFlush": false,
>          "waitSearcher": false
>      },
>      "delete": [
>          {
>              "id": "ID"
>          },
>          {
>              "query": "QUERY"
>          }
>      ]
> }
>
> Thanks you for you previous response Yonik.
>
> 2011/7/20 Yonik Seeley<yo...@lucidimagination.com>
>
>> On Wed, Jul 20, 2011 at 12:16 PM, Remy Loubradou
>> <re...@gmail.com>  wrote:
>>> Hi,
>>> I was writing a Solr Client API for Node and I found an error on this
>> page
>>> http://wiki.apache.org/solr/UpdateJSON ,on the section "Update Commands"
>> the
>>> JSON is not valid because there are duplicate keys and two times with
>> "add"
>>> and "delete".
>>
>> It's a common misconception that it's invalid JSON.  Duplicate keys
>> are in fact legal.
>>
>> -Yonik
>> http://www.lucidimagination.com
>>
>> I tried with an array and it doesn't work as well, I got error
>>> 400, I think that's because the syntax is bad.
>>>
>>> I don't really know if I am at the good place to talk about that but ...
>>> that the only place I found. Sorry if it's not.
>>>
>>> Thanks,
>>>
>>> And I love Solr :)
>>>
>>
>

Re: Wiki Error JSON syntax

Posted by Remy Loubradou <re...@gmail.com>.
I think I can trust you, but this is weird.
Funny thing: if you try to validate this JSON on http://jsonlint.com/,
duplicate keys are automatically removed. But the thing is, how can you
possibly generate this JSON with a JavaScript object?
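
For what it's worth, a plain object passed to `JSON.stringify` cannot produce
repeated names. Here is a minimal sketch of one workaround, assuming the
commands are kept as an ordered list of [name, payload] pairs and joined by
hand. The helper name `serializeCommands` is hypothetical, and this is only
an illustration, not something taken from the wiki page.

```javascript
// Hypothetical helper: serialize an ordered list of [name, payload] pairs
// into one JSON object text, allowing the same name to appear more than once.
function serializeCommands(commands) {
  const members = commands.map(
    ([name, payload]) => JSON.stringify(name) + ":" + JSON.stringify(payload)
  );
  return "{" + members.join(",") + "}";
}

const body = serializeCommands([
  ["add", { doc: { id: "DOC1" } }],
  ["add", { commitWithin: 5000, overwrite: false, doc: { f1: "v2" } }],
  ["commit", {}],
  ["delete", { id: "ID" }],
  ["delete", { query: "QUERY" }],
]);

// The resulting text contains "add" and "delete" twice each.
console.log(body);
```

Each payload is still serialized by `JSON.stringify`, so only the outer
object is assembled manually.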

It would be really nice to combine both ways that you show on the page.
Something like:

{
    "add": [
        {
            "doc": {
                "id": "DOC1",
                "my_boosted_field": {
                    "boost": 2.3,
                    "value": "test"
                },
                "my_multivalued_field": [
                    "aaa",
                    "bbb"
                ]
            }
        },
        {
            "commitWithin": 5000,
            "overwrite": false,
            "boost": 3.45,
            "doc": {
                "f1": "v2"
            }
        }
    ],
    "commit": {},
    "optimize": {
        "waitFlush": false,
        "waitSearcher": false
    },
    "delete": [
        {
            "id": "ID"
        },
        {
            "query": "QUERY"
        }
    ]
}

Thank you for your previous response, Yonik.

2011/7/20 Yonik Seeley <yo...@lucidimagination.com>

> On Wed, Jul 20, 2011 at 12:16 PM, Remy Loubradou
> <re...@gmail.com> wrote:
> > Hi,
> > I was writing a Solr Client API for Node and I found an error on this
> page
> > http://wiki.apache.org/solr/UpdateJSON ,on the section "Update Commands"
> the
> > JSON is not valid because there are duplicate keys and two times with
> "add"
> > and "delete".
>
> It's a common misconception that it's invalid JSON.  Duplicate keys
> are in fact legal.
>
> -Yonik
> http://www.lucidimagination.com
>
> I tried with an array and it doesn't work as well, I got error
> > 400, I think that's because the syntax is bad.
> >
> > I don't really know if I am at the good place to talk about that but ...
> > that the only place I found. Sorry if it's not.
> >
> > Thanks,
> >
> > And I love Solr :)
> >
>

Re: Wiki Error JSON syntax

Posted by Yonik Seeley <yo...@lucidimagination.com>.
On Wed, Jul 20, 2011 at 12:16 PM, Remy Loubradou
<re...@gmail.com> wrote:
> Hi,
> I was writing a Solr Client API for Node and I found an error on this page
> http://wiki.apache.org/solr/UpdateJSON ,on the section "Update Commands" the
> JSON is not valid because there are duplicate keys and two times with "add"
> and "delete".

It's a common misconception that it's invalid JSON.  Duplicate keys
are in fact legal.

-Yonik
http://www.lucidimagination.com

> I tried with an array and it doesn't work as well, I got error
> 400, I think that's because the syntax is bad.
>
> I don't really know if I am at the good place to talk about that but ...
> that the only place I found. Sorry if it's not.
>
> Thanks,
>
> And I love Solr :)
>