Posted to dev@arrow.apache.org by "Vidar Ingason (Jira)" <ji...@apache.org> on 2019/10/29 08:09:00 UTC

[jira] [Created] (ARROW-7018) Special characters as question mark in parquet files in R

Vidar Ingason created ARROW-7018:
------------------------------------

             Summary: Special characters as question mark in parquet files in R
                 Key: ARROW-7018
                 URL: https://issues.apache.org/jira/browse/ARROW-7018
             Project: Apache Arrow
          Issue Type: Bug
          Components: R
    Affects Versions: 0.15.0
         Environment: I'm running R on Windows 10
            Reporter: Vidar Ingason


Hello.
I'm new to the arrow package in R and I'm having trouble with special (Icelandic) characters. I have a large data set, and everything is fine until I write the file to disk and read it back in (i.e., I use write_parquet() and then read_parquet()). When I read the data back into R, special characters turn into question marks. For example, Veitingastaðir becomes Veitingasta�ir.

This does not happen when I use .csv.

Is there anything I can do when I write the .parquet file to disk or when I read it in to prevent this?
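
One possible explanation, offered only as a guess, is an encoding mismatch: Parquet string columns are expected to hold UTF-8, while R on Windows often keeps strings in a single-byte native code page. The sketch below uses Python's standard library only to show the byte-level effect; it does not use arrow. The choice of CP1252 as the native encoding is an assumption, not something confirmed for this data set. In R, base enc2utf8() performs this kind of conversion, but whether applying it to character columns before write_parquet() avoids the problem in 0.15.0 is untested here.

```python
# Illustration only: what happens when bytes from a single-byte Windows
# code page are interpreted as UTF-8. This does not call arrow or Parquet.

word = "Veitingastaðir"

# Assumption: the string is held in CP1252, where "ð" is the single byte 0xF0.
native_bytes = word.encode("cp1252")

# Read as UTF-8, 0xF0 starts a multi-byte sequence that never completes,
# so a lenient decoder substitutes U+FFFD (the replacement character).
misread = native_bytes.decode("utf-8", errors="replace")

# Converting to UTF-8 before writing round-trips without loss.
utf8_bytes = word.encode("utf-8")
round_tripped = utf8_bytes.decode("utf-8")

# ascii() escapes non-ASCII characters so the output is safe on any console.
print(ascii(misread))
print(ascii(round_tripped))
```

If the mismatch is the cause, the damage happens at write time, so the file would need to be rewritten from the original data after conversion; the replaced character cannot be recovered from the misread string.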



--
This message was sent by Atlassian Jira
(v8.3.4#803005)