Posted to issues@arrow.apache.org by "Zsolt Kegyes-Brassai (Jira)" <ji...@apache.org> on 2022/04/25 13:04:00 UTC
[jira] [Created] (ARROW-16318) Timezone is not supported by to_duckdb()
Zsolt Kegyes-Brassai created ARROW-16318:
--------------------------------------------
Summary: Timezone is not supported by to_duckdb()
Key: ARROW-16318
URL: https://issues.apache.org/jira/browse/ARROW-16318
Project: Apache Arrow
Issue Type: Bug
Affects Versions: 7.0.0
Reporter: Zsolt Kegyes-Brassai
Here is a reproducible example:
{code:java}
library(tidyverse)
library(arrow)
df1 <- tibble(time = lubridate::now(tzone = "UTC"))
str(df1)
#> tibble [1 x 1] (S3: tbl_df/tbl/data.frame)
#> $ time: POSIXct[1:1], format: "2022-04-25 12:50:10"
write_dataset(df1, here::here("temp/df1"), format = "parquet")
open_dataset(here::here("temp/df1")) |>
  to_duckdb()
#> Error: duckdb_prepare_R: Failed to prepare query SELECT *
#> FROM "arrow_001" AS "q01"
#> WHERE (0 = 1)
#> Error: Not implemented Error: Unsupported Internal Arrow Type tsu:UTC
df2 <- tibble(time = lubridate::now())
str(df2)
#> tibble [1 x 1] (S3: tbl_df/tbl/data.frame)
#> $ time: POSIXct[1:1], format: "2022-04-25 14:50:11"
write_dataset(df2, here::here("temp/df2"), format = "parquet")
open_dataset(here::here("temp/df2")) |>
  to_duckdb()
#> # Source: table<arrow_002> [?? x 1]
#> # Database: duckdb_connection
#> time
#> <dttm>
#> 1 2022-04-25 12:50:11
{code}
Timestamps without timezone information work fine; only the column carrying an explicit timezone (UTC in the first example) fails.
How can one easily remove the timezone information from a {{timestamp}} type column in a Parquet dataset?
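One possible workaround, not verified against this version, might be to cast the column to a timestamp type without a timezone before handing the data to DuckDB. The following is only a sketch: it assumes that arrow's {{cast()}} can be used inside {{mutate()}} on a dataset query, that {{to_duckdb()}} accepts the resulting query, and that microseconds is the right unit (the {{tsu}} in the error message suggests a microsecond timestamp). The cast should keep the stored UTC instant and drop only the timezone label.

{code:r}
library(dplyr)
library(arrow)

# Assumption: the dataset written in the example above exists at temp/df1
# and has a timestamp column `time` tagged with the UTC timezone.
open_dataset(here::here("temp/df1")) |>
  # Cast to a timestamp type with no timezone attached.
  # timestamp() defaults to timezone = "", i.e. a timezone-less type.
  mutate(time = cast(time, timestamp(unit = "us"))) |>
  to_duckdb()
{code}

If this works, the values seen in DuckDB would be the UTC wall-clock times, so any conversion to a local timezone would have to happen afterwards.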
--
This message was sent by Atlassian Jira
(v8.20.7#820007)