Posted to jira@arrow.apache.org by "Nicola Crane (Jira)" <ji...@apache.org> on 2021/10/01 10:12:00 UTC
[jira] [Created] (ARROW-14190) [R] Should unify_schemas() allow change of type?
Nicola Crane created ARROW-14190:
------------------------------------
Summary: [R] Should unify_schemas() allow change of type?
Key: ARROW-14190
URL: https://issues.apache.org/jira/browse/ARROW-14190
Project: Apache Arrow
Issue Type: Improvement
Components: R
Reporter: Nicola Crane
Should {{unify_schemas()}} support schema evolution? When files whose schemas have different but compatible types (for example {{int32}} and {{float64}}) are read together with {{open_dataset()}}, the call succeeds, whereas combining the same schemas with {{unify_schemas()}} results in an error.
See discussion here: https://github.com/apache/arrow-cookbook/pull/67#discussion_r714847220
{code:r}
library(dplyr)
library(arrow)
# Set up schemas
schema1 <- schema(speed = int32(), dist = int32())
schema2 <- schema(speed = float64(), dist = float64())
# Try to combine schemas via `unify_schemas()` - results in an error
unify_schemas(schema1, schema2)
## Error: Invalid: Unable to merge: Field speed has incompatible types: int32 vs double
## /home/nic2/arrow/cpp/src/arrow/type.cc:1609 fields_[i]->MergeWith(field)
## /home/nic2/arrow/cpp/src/arrow/type.cc:1672 AddField(field)
## /home/nic2/arrow/cpp/src/arrow/type.cc:1743 builder.AddSchema(schema)
# Create datasets with different schemas and read in via `open_dataset()`
cars1 <- Table$create(slice(cars, 1:25), schema = schema1)
cars2 <- Table$create(slice(cars, 26:50), schema = schema2)
td <- tempfile()
dir.create(td)
write_parquet(cars1, file.path(td, "cars1.parquet"))
write_parquet(cars2, file.path(td, "cars2.parquet"))
new_dataset <- open_dataset(td)
new_dataset$schema
## Schema
## speed: int32
## dist: int32
##
## See $metadata for additional Schema metadata
{code}
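The schema printed above matches the first file only ({{int32}}), so it may be worth checking whether {{open_dataset()}} is actually merging the two schemas. The following is a minimal sketch, not a confirmed diagnosis. It assumes that the {{unify_schemas}} argument of {{open_dataset()}} defaults to inspecting only the first file when given a directory path, and that setting it to {{TRUE}} makes every file's schema be inspected. The names {{ds_first}} and {{ds_all}} are illustrative only.
{code:r}
library(dplyr)
library(arrow)

# Recreate the two Parquet files from the example above
schema1 <- schema(speed = int32(), dist = int32())
schema2 <- schema(speed = float64(), dist = float64())
td <- tempfile()
dir.create(td)
write_parquet(Table$create(slice(cars, 1:25), schema = schema1), file.path(td, "cars1.parquet"))
write_parquet(Table$create(slice(cars, 26:50), schema = schema2), file.path(td, "cars2.parquet"))

# Default behaviour (assumption: only the first file's schema is inspected)
ds_first <- open_dataset(td)
print(ds_first$schema)

# Request inspection of all files; capture any error instead of stopping
ds_all <- tryCatch(
  open_dataset(td, unify_schemas = TRUE),
  error = function(e) e
)
print(ds_all)
{code}
If the second call returns an error similar to the one from {{unify_schemas()}}, that would suggest {{open_dataset()}} succeeds by default because it skips unification rather than because it promotes types.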