Posted to jira@arrow.apache.org by "Antoine Pitrou (Jira)" <ji...@apache.org> on 2020/06/17 19:19:00 UTC

[jira] [Comment Edited] (ARROW-7018) [R] Special characters as question mark in parquet files

    [ https://issues.apache.org/jira/browse/ARROW-7018?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17138741#comment-17138741 ] 

Antoine Pitrou edited comment on ARROW-7018 at 6/17/20, 7:18 PM:
-----------------------------------------------------------------

Why do you have a latin1 locale in the first place? Is this on Windows?


was (Author: pitrou):
Why do you have a latin1 in the first place? Is this on Windows?

> [R] Special characters as question mark in parquet files
> --------------------------------------------------------
>
>                 Key: ARROW-7018
>                 URL: https://issues.apache.org/jira/browse/ARROW-7018
>             Project: Apache Arrow
>          Issue Type: Bug
>          Components: R
>    Affects Versions: 0.15.0
>         Environment: I'm running R on Windows 10
>            Reporter: Vidar Ingason
>            Assignee: Romain Francois
>            Priority: Critical
>             Fix For: 1.0.0
>
>
> Hello.
> I'm new to the arrow package in R and I'm having trouble with special characters (Icelandic). I have a large data set and everything is fine until I write the file to disk and read it back in (i.e. I use write_parquet() and then read_parquet()). When I read the data back into R, special characters turn into question marks, e.g. Veitingastaðir becomes Veitingasta�ir.
> This does not happen when I use .csv.
> Is there anything I can do when I write the .parquet file to disk or when I read it in to prevent this?
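
For readers unfamiliar with how a latin1 locale could produce this symptom, here is a minimal sketch of one plausible mechanism. It is an assumption prompted by the locale question in the comment above, not a confirmed diagnosis of this issue, and it is written in Python rather than R only to keep the example independent of the machine's locale. The variable names are illustrative. It shows that the latin1 byte for "ð" is not valid UTF-8 on its own, so a reader that expects UTF-8 replaces it with a placeholder character.

```python
# Illustrative sketch only: it mimics a latin1/UTF-8 mismatch and does not
# call arrow or reproduce the actual write_parquet()/read_parquet() code path.

original = "Veitingastaðir"

# Assumption: the string is stored as latin1 bytes (as it might be in a
# session running under a latin1 locale). "ð" becomes the single byte 0xF0.
latin1_bytes = original.encode("latin-1")

# A reader that assumes UTF-8 cannot decode 0xF0 followed by "i", so the
# byte is replaced with U+FFFD, the replacement character.
misread = latin1_bytes.decode("utf-8", errors="replace")

# If the bytes are UTF-8 in the first place, the round trip is lossless.
utf8_bytes = original.encode("utf-8")
round_trip = utf8_bytes.decode("utf-8")

print(latin1_bytes)
print(misread)
print(round_trip == original)
```

If this is indeed what is happening, the text would need to be UTF-8 before it is written; whether that applies here depends on the answer to the locale question above.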



--
This message was sent by Atlassian Jira
(v8.3.4#803005)