Posted to issues@arrow.apache.org by "Zsolt Kegyes-Brassai (Jira)" <ji...@apache.org> on 2022/04/25 13:04:00 UTC

[jira] [Created] (ARROW-16318) Timezone is not supported by to_duckdb()

Zsolt Kegyes-Brassai created ARROW-16318:
--------------------------------------------

             Summary: Timezone is not supported by to_duckdb()
                 Key: ARROW-16318
                 URL: https://issues.apache.org/jira/browse/ARROW-16318
             Project: Apache Arrow
          Issue Type: Bug
    Affects Versions: 7.0.0
            Reporter: Zsolt Kegyes-Brassai


Here is a reproducible example:

 
{code:java}
library(tidyverse)
library(arrow)

df1 <- tibble(time = lubridate::now(tzone = "UTC"))
str(df1)
#> tibble [1 x 1] (S3: tbl_df/tbl/data.frame)
#>  $ time: POSIXct[1:1], format: "2022-04-25 12:50:10"
write_dataset(df1, here::here("temp/df1"), format = "parquet")
open_dataset(here::here("temp/df1")) |> 
  to_duckdb()
#> Error: duckdb_prepare_R: Failed to prepare query SELECT *
#> FROM "arrow_001" AS "q01"
#> WHERE (0 = 1)
#> Error: Not implemented Error: Unsupported Internal Arrow Type tsu:UTC

df2 <- tibble(time = lubridate::now())
str(df2)
#> tibble [1 x 1] (S3: tbl_df/tbl/data.frame)
#>  $ time: POSIXct[1:1], format: "2022-04-25 14:50:11"
write_dataset(df2, here::here("temp/df2"), format = "parquet")
open_dataset(here::here("temp/df2")) |> 
  to_duckdb()
#> # Source:   table<arrow_002> [?? x 1]
#> # Database: duckdb_connection
#>   time               
#>   <dttm>             
#> 1 2022-04-25 12:50:11
{code}
 

Timestamps without timezone information work fine.

How can one easily remove the timezone information from a {{timestamp}} column in a Parquet dataset?


