Posted to dev@couchdb.apache.org by "Luke Driscoll (JIRA)" <ji...@apache.org> on 2010/12/22 04:32:00 UTC

[jira] Created: (COUCHDB-995) Changes feed returns duplicate fields with include_docs=true

Changes feed returns duplicate fields with include_docs=true
------------------------------------------------------------

                 Key: COUCHDB-995
                 URL: https://issues.apache.org/jira/browse/COUCHDB-995
             Project: CouchDB
          Issue Type: Bug
          Components: Full-Text Search, HTTP Interface
    Affects Versions: 1.0.1
         Environment: MacOSX with CouchDBX 1.0.1.1 as well as homebrew couchdb 1.0.1
            Reporter: Luke Driscoll


I ran into a problem when using couchdb-lucene, but the problem is with CouchDB itself.  I've seen this happen on both CouchDBX 1.0.1.1 and couchdb 1.0.1 (through Homebrew).

The problem is that if I update a document with the same data each time, the data that comes out of the changes feed has duplicate fields.  The call:
http://localhost:5984/test/_changes?feed=continuous&heartbeat=15000&include_docs=true&since=0

is returning data like this:
{
	"seq":356,
	"id":"encounter_83-20101218T133000.000-0700",
	"changes":[{"rev":"2-ada5250d09a364608db6cd639c213eae"}],
	"doc":{
		"_id":"encounter_83-20101218T133000.000-0700",
		"_rev":"2-ada5250d09a364608db6cd639c213eae",
		"location":{
			"organisation":{
				"name":"Some Org",
				"abbrev":"0"
			},
			"location":{
				"name":"Other Loc",
				"abbrev":"Othe"
			}
		},
		"comment":"Broken",
		"appointmentDateTime":"2010-12-18T13:30:00.000-07:00",
->		"patient_id":"patient_83",
		"appointmentType":"Acute",
->		"type":"encounter",
->		"patient_id":"patient_83",
->		"type":"encounter"
	}
}

You'll notice that the patient_id and type fields are duplicated in the returned data.  This is causing couchdb-lucene to baulk, but it's also just invalid JSON.
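
For readers who want to consume a feed like the one above, here is a minimal sketch in Python. It assumes that a continuous feed delivers one JSON object per line and that heartbeats arrive as empty lines. The function name iter_changes and the sample lines are hypothetical and only illustrate the shape of the data; they are not taken from the report.

```python
import json
from typing import Iterable, Iterator


def iter_changes(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one parsed change per non-empty line of a continuous feed."""
    for line in lines:
        line = line.strip()
        if not line:
            # Assumption: heartbeats are empty lines and carry no change.
            continue
        yield json.loads(line)


# Hypothetical sample of feed output, including one heartbeat line.
sample = [
    '{"seq":1,"id":"a","changes":[{"rev":"1-abc"}]}\n',
    '\n',
    '{"seq":2,"id":"b","changes":[{"rev":"1-def"}]}\n',
]
changes = list(iter_changes(sample))
```

Because the function takes any iterable of strings, it can be fed lines from an HTTP response body or from a saved capture of the feed.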

Thanks


-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Resolved: (COUCHDB-995) Changes feed returns duplicate fields with include_docs=true

Posted by "Paul Joseph Davis (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/COUCHDB-995?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Paul Joseph Davis resolved COUCHDB-995.
---------------------------------------

    Resolution: Won't Fix

Different parsers will do different things with duplicate fields. Python and JavaScript will both overwrite previous values. Erlang just keeps both values.
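
As a small illustration of the overwrite behaviour in Python, the sketch below parses an object with a repeated field using the standard json module. The second patient_id value is deliberately changed from the report (a hypothetical "patient_84") so that it is visible which value survives.

```python
import json

# The same field appears twice; the second value is altered for illustration.
raw = '{"patient_id": "patient_83", "type": "encounter", "patient_id": "patient_84"}'

# json.loads builds a dict, so a later value replaces an earlier one.
doc = json.loads(raw)
```

After parsing, doc holds a single patient_id entry with the later value, and no error is raised.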

Seeing as this is the first report of the issue, I don't think it's that big of a deal. If it becomes a more popular complaint, we can take a look at using sets to check each field as it's parsed.
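
The set-based check mentioned above could look roughly like the following on the client side. This is only a sketch in Python, not CouchDB's Erlang parser; the names DuplicateFieldError, reject_duplicates and strict_loads are hypothetical. It relies on the object_pairs_hook parameter of json.loads, which receives the key/value pairs of every object, including nested ones.

```python
import json


class DuplicateFieldError(ValueError):
    """Raised when a JSON object repeats a field name (hypothetical name)."""


def reject_duplicates(pairs):
    # Track the field names seen so far in this one object.
    seen = set()
    for key, _ in pairs:
        if key in seen:
            raise DuplicateFieldError(f"duplicate field: {key!r}")
        seen.add(key)
    return dict(pairs)


def strict_loads(text):
    # The hook is called once per JSON object, innermost objects first.
    return json.loads(text, object_pairs_hook=reject_duplicates)
```

The check is per object, so the same name may still appear at different nesting levels, as with "location" in the reported document.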



[jira] [Commented] (COUCHDB-995) Changes feed returns duplicate fields with include_docs=true

Posted by "Bogdan Artyushenko (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/COUCHDB-995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13063950#comment-13063950 ] 

Bogdan Artyushenko commented on COUCHDB-995:
--------------------------------------------

I'm not quite sure, but I see a problem of this type with CouchDB 1.1.0. If I use 0.10 (or even 1.0.2) on the same PC, I don't have this problem.


[jira] Commented: (COUCHDB-995) Changes feed returns duplicate fields with include_docs=true

Posted by "Luke Driscoll (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/COUCHDB-995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12974031#action_12974031 ] 

Luke Driscoll commented on COUCHDB-995:
---------------------------------------

I've just done a further experiment: running a straight GET gives me the same problem.
http://localhost:5984/test/encounter_83-20101218T133000.000-0700
returns:
{"_id":"encounter_83-20101218T133000.000-0700","_rev":"2-ada5250d09a364608db6cd639c213eae","location":{"organization":{"name":"Allscripts Heath System","abbrev":"0"},"location":{"name":"Other Loc","abbrev":"Othe      "}},"comment":"Infertility","appointmentDateTime":"2010-12-18T13:30:00.000-07:00","patient_id":"patient_83","appointmentType":"Acute","type":"encounter","patient_id":"patient_83","type":"encounter"}




[jira] [Commented] (COUCHDB-995) Changes feed returns duplicate fields with include_docs=true

Posted by "Robert Newson (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/COUCHDB-995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13063966#comment-13063966 ] 

Robert Newson commented on COUCHDB-995:
---------------------------------------

I'm curious to know how couchdb-lucene reacts. Since it passes through a Python dict, I'd have expected duplicates to be dropped silently.


[jira] Commented: (COUCHDB-995) Changes feed returns duplicate fields with include_docs=true

Posted by "Luke Driscoll (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/COUCHDB-995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12974032#action_12974032 ] 

Luke Driscoll commented on COUCHDB-995:
---------------------------------------

On further inspection, it seems that Ektorp is causing the problem, but wouldn't it make sense for CouchDB to reject a save if you sent JSON like this?
{
   "field1": "a",
   "field1": "b"
}



[jira] Commented: (COUCHDB-995) Changes feed returns duplicate fields with include_docs=true

Posted by "Luke Driscoll (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/COUCHDB-995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12974192#action_12974192 ] 

Luke Driscoll commented on COUCHDB-995:
---------------------------------------

That makes sense.
