Posted to jira@arrow.apache.org by "Micah Kornfield (Jira)" <ji...@apache.org> on 2021/01/23 03:49:00 UTC
[jira] [Created] (ARROW-11353) [C++][Python][Parquet] We should
allow for overriding to large types by providing a schema
Micah Kornfield created ARROW-11353:
---------------------------------------
Summary: [C++][Python][Parquet] We should allow for overriding to large types by providing a schema
Key: ARROW-11353
URL: https://issues.apache.org/jira/browse/ARROW-11353
Project: Apache Arrow
Issue Type: Bug
Components: C++, Python
Reporter: Micah Kornfield
The following shouldn't throw:

>>> import pyarrow as pa
>>> import pyarrow.parquet as pq
>>> import pyarrow.dataset as ds
>>> pa.__version__
'2.0.0'
>>> schema = pa.schema([pa.field("utf8", pa.utf8())])
>>> table = pa.Table.from_pydict({"utf8": ["foo", "bar"]}, schema)
>>> pq.write_table(table, "/tmp/example.parquet")
>>> large_schema = pa.schema([pa.field("utf8", pa.large_utf8())])
>>> ds.dataset("/tmp/example.parquet", schema=large_schema, format="parquet").to_table()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "pyarrow/_dataset.pyx", line 405, in pyarrow._dataset.Dataset.to_table
  File "pyarrow/_dataset.pyx", line 2262, in pyarrow._dataset.Scanner.to_table
  File "pyarrow/error.pxi", line 122, in pyarrow.lib.pyarrow_internal_check_status
  File "pyarrow/error.pxi", line 107, in pyarrow.lib.check_status
pyarrow.lib.ArrowTypeError: fields had matching names but differing types. From: utf8: string To: utf8: large_string