Posted to user@couchdb.apache.org by Michael Genereux <mg...@gmail.com> on 2010/03/27 00:09:17 UTC
Rails data to CouchDB
I was going to send the problem below in an email, but I found the
solution instead and am sharing it so others don't have to deal with
it.
--- Problem ---
I'm exporting a portion of a massive MySQL database that I'm thinking
is better suited to a CouchDB database. The MySQL database supports a
Rails application and the MySQL server is set to UTF8 for everything.
I'm using to_json in Rails, which appears to convert the records just
fine to JSON. I get about 200 records converted and imported into
CouchDB and then the process dies with "Invalid UTF-8 JSON". One of
my fields in the offending record has the word "fête". The JSON
produced by Rails doesn't convert this character to the \uXXXX
notation. I don't think it should have to but maybe I'm not
understanding the standard.
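
For what it's worth, JSON does not require non-ASCII characters to be
escaped; raw UTF-8 is fine as long as the bytes really are valid UTF-8.
A minimal sketch of checking that before sending a document, using
String#valid_encoding? (Ruby 1.9 or later). The helper name and the
sample strings are made up for illustration:

```ruby
# Hypothetical helper: true if the string's bytes are well-formed UTF-8.
def valid_utf8?(str)
  str.dup.force_encoding(Encoding::UTF_8).valid_encoding?
end

good = "f\u00EAte"  # "fête" with the ê encoded as UTF-8 (bytes C3 AA)
bad  = "f\xEAte"    # same word, but with a lone ISO-8859-1 byte EA for the ê

good_ok = valid_utf8?(good)
bad_ok  = valid_utf8?(bad)
```

A record whose JSON fails this check is a likely candidate for the
"Invalid UTF-8 JSON" error.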
--- Solution ---
The original importer of the data took ISO-8859-1 data and jammed it
into a UTF-8 field in the database. The character that I was having
problems with was being auto-translated by the web browser as a kind
of ASCII/ISO-8859-1/Windows-1252 fallback for non-UTF-8 characters. So
I could cut and paste the converted Unicode character from phpMyAdmin
right into Futon and the JSON was valid. The solution within Rails:
after converting the Rails object to JSON, I ran "json_data =
Iconv.conv( 'utf-8', 'iso-8859-1', json_data )" to convert the bad
characters to UTF-8. Worked like a charm!
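
On newer Rubies, where Iconv is no longer part of the standard library
(it was removed in Ruby 2.0), the same conversion can be sketched with
String#force_encoding and String#encode. This is only a sketch, assuming
every byte of the input really is ISO-8859-1; the helper name and the
sample string are hypothetical:

```ruby
# Hypothetical helper: treat the bytes as ISO-8859-1 and transcode to UTF-8.
def latin1_to_utf8(raw)
  raw.dup.force_encoding(Encoding::ISO_8859_1).encode(Encoding::UTF_8)
end

# "fête" as ISO-8859-1 bytes; the lone byte EA is not valid UTF-8.
raw = "f\xEAte".b

fixed = latin1_to_utf8(raw)
```

If some records already hold proper UTF-8, converting them again would
double-encode them, so it may be safer to convert only the strings that
fail a valid_encoding? check.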