Posted to dev@lucene.apache.org by "James Dyer (JIRA)" <ji...@apache.org> on 2017/05/01 16:52:04 UTC
[jira] [Commented] (SOLR-10522) Duplicate keys in "collations" object with JSON response format
[ https://issues.apache.org/jira/browse/SOLR-10522?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15991076#comment-15991076 ]
James Dyer commented on SOLR-10522:
-----------------------------------
We might need to rethink our work on SOLR-9972. My apologies, [~cpoerschke]: when I reviewed SOLR-9972, I hadn't realized that we have more than one JSON format and that fixing one might break the others.
Prior to SOLR-9972, our "flat" (default) JSON looked like this for the collation section:
{noformat}
"collations":[
"collation",{
"collationQuery":"lowerfilt:(+faith +hope +loaves)",
"hits":1,
"misspellingsAndCorrections":[
"fauth","faith",
"home","hope",
"loane","loaves"]},
"collation",{
"collationQuery":"lowerfilt:(+faith +hope +love)",
"hits":1,
"misspellingsAndCorrections":[
"fauth","faith",
"home","hope",
"loane","love"]}]
{noformat}
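...by having "collations" as a NamedList, which the flat format writes as an array of alternating names and values, we avoid having duplicate keys with "collation". But the "arrntv" format chokes around the "collationQuery" and produces invalid JSON: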
{noformat}
"collations":
[
{"name":"collation",{
"type":"str","value":"collationQuery":"lowerfilt:(+faith +hope +loaves)",
"hits":1,
"misspellingsAndCorrections":
[
{"name":"fauth","type":"str","value":"faith"},
{"name":"home","type":"str","value":"hope"},
{"name":"loane","type":"str","value":"loaves"}]}},
{"name":"collation",{
"type":"str","value":"collationQuery":"lowerfilt:(+faith +hope +love)",
"hits":1,
"misspellingsAndCorrections":
[
{"name":"fauth","type":"str","value":"faith"},
{"name":"home","type":"str","value":"hope"},
{"name":"loane","type":"str","value":"love"}]}}]
{noformat}
...So SOLR-9972 changed "collations" to be a SimpleOrderedMap. Now we get this for "arrntv":
{noformat}
"collations":{
"collation":{
"collationQuery":"lowerfilt:(+faith +hope +loaves)",
"hits":1,
"misspellingsAndCorrections":
[
{"name":"fauth","type":"str","value":"faith"},
{"name":"home","type":"str","value":"hope"},
{"name":"loane","type":"str","value":"loaves"}]},
"collation":{
"collationQuery":"lowerfilt:(+faith +hope +love)",
"hits":1,
"misspellingsAndCorrections":
[
{"name":"fauth","type":"str","value":"faith"},
{"name":"home","type":"str","value":"hope"},
{"name":"loane","type":"str","value":"love"}]}}
{noformat}
...so now it renders valid json. But under "collations", we have duplicate keys, right? If there is more than 1 collation, the "collation" key keeps getting overwritten.
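To illustrate the overwriting from a client's point of view, here is a minimal sketch using Python's standard `json` module (my choice of client, not something from the issue). The payload is a trimmed-down, hand-written version of the post-SOLR-9972 output above, not a captured response. Other parsers may treat duplicate keys differently, so take this as one example of the problem rather than a statement about every client.

```python
import json

# Trimmed, hand-written stand-in for the 6.5 "collations" object:
# two members that share the key "collation".
doc = """
{"collations": {
  "collation": {"collationQuery": "lowerfilt:(+faith +hope +loaves)", "hits": 1},
  "collation": {"collationQuery": "lowerfilt:(+faith +hope +love)", "hits": 1}
}}
"""

# json.loads accepts the text, but a dict can hold only one value per key,
# so the later "collation" replaces the earlier one.
parsed = json.loads(doc)
collations = parsed["collations"]

print(len(collations))                             # number of collations left
print(collations["collation"]["collationQuery"])   # only the last one survives
```

With this parser the first collation ("loaves") is silently dropped, and the client has no indication that anything is missing.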
So then, it seems that SOLR-9972 is only a partial fix for "arrntv": while we now have valid JSON, there are duplicate keys. But worse, SOLR-9972 broke the default JSON format, both from a backwards-compatibility standpoint and from a correctness standpoint, as the default format is now also subject to duplicate keys.
I'd think reverting SOLR-9972 would leave us in a better situation than the current one. But can someone suggest a solution that would result in:
- valid json for all the various json formats we support
- no duplicate keys when there are multiple collations
- no breaking of backwards compatibility until 7.0, except for the completely broken "arrntv" case? (6.5 changes notwithstanding, breaking backwards compatibility here was a bug in my opinion.)
??
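The first two requirements above could be checked mechanically for each response format. The following is only a sketch of such a check, again in Python with the standard `json` module; the helper name `find_duplicate_keys` and the three sample strings are hypothetical, hand-trimmed from the examples in this comment, and nothing here is part of Solr or its test suite.

```python
import json

def find_duplicate_keys(text):
    """Parse text as JSON and return the sorted keys that repeat within a
    single object. Raises json.JSONDecodeError if text is not valid JSON."""
    duplicates = set()

    def record_duplicates(pairs):
        # pairs holds every (key, value) of one object, in document order,
        # before any duplicate has been collapsed.
        seen = set()
        for key, _value in pairs:
            if key in seen:
                duplicates.add(key)
            seen.add(key)
        return dict(pairs)

    json.loads(text, object_pairs_hook=record_duplicates)
    return sorted(duplicates)

# Hand-trimmed samples of the three shapes discussed above (assumptions,
# not captured Solr output).
flat_before_9972 = '{"collations": ["collation", {"hits": 1}, "collation", {"hits": 1}]}'
map_after_9972 = '{"collations": {"collation": {"hits": 1}, "collation": {"hits": 1}}}'
arrntv_before_9972 = '{"collations": [{"name": "collation", {"type": "str"}}]}'

print(find_duplicate_keys(flat_before_9972))  # no repeated keys
print(find_duplicate_keys(map_after_9972))    # "collation" repeats

try:
    find_duplicate_keys(arrntv_before_9972)
    arrntv_is_valid = True
except json.JSONDecodeError:
    arrntv_is_valid = False
print(arrntv_is_valid)
```

A check along these lines, run once per supported `json.nl` style, would flag both failure modes described here: output that does not parse at all, and output that parses but repeats a key.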
> Duplicate keys in "collations" object with JSON response format
> ---------------------------------------------------------------
>
> Key: SOLR-10522
> URL: https://issues.apache.org/jira/browse/SOLR-10522
> Project: Solr
> Issue Type: Bug
> Security Level: Public(Default Security Level. Issues are Public)
> Components: spellchecker
> Affects Versions: 6.5
> Reporter: Nikita Pchelintsev
> Assignee: James Dyer
> Priority: Minor
>
> After upgrading Solr 6.3 -> 6.5 I've noticed a change in how json response writer outputs "collations" response key when spellchecking is enabled (wt=json&json.nl=arrarr)
> Solr 6.3:
> "collations":
> [
> ["collation",{
> "collationQuery":"the",
> "hits":48,
> "maxScore":"30.282",
> "misspellingsAndCorrections":
> [
> ["thea","the"]]}],
> ["collation",{
> "collationQuery":"tea",
> "hits":3,
> "maxScore":"2.936",
> "misspellingsAndCorrections":
> [
> ["thea","tea"]]}],
> ...
> Solr 6.5:
> "collations":{
> "collation":{
> "collationQuery":"the",
> "hits":43,
> "misspellingsAndCorrections":
> [
> ["thea","the"]]},
> "collation":{
> "collationQuery":"tea",
> "hits":3,
> "misspellingsAndCorrections":
> [
> ["thea","tea"]]},
> ...
> Solr 6.5 outputs an object instead of an array, and it has duplicate keys, which is not valid for the JSON format.
> Any help is appreciated.