Posted to jira@arrow.apache.org by "David Li (Jira)" <ji...@apache.org> on 2022/10/14 14:53:00 UTC
[jira] [Created] (ARROW-18060) [C++] Writing a dataset with 0 rows doesn't create any files
David Li created ARROW-18060:
--------------------------------
Summary: [C++] Writing a dataset with 0 rows doesn't create any files
Key: ARROW-18060
URL: https://issues.apache.org/jira/browse/ARROW-18060
Project: Apache Arrow
Issue Type: Improvement
Components: C++
Affects Versions: 9.0.0
Reporter: David Li
If the input data has no rows, no files get created. This may be unexpected, since it looks like "nothing happened". It might be nicer to create an empty file. With partitioning, though, that gets awkward (there are no partition values), so raising an error might make more sense instead.
Reproduction in Python
{code:python}
import tempfile
from pathlib import Path
import pyarrow
import pyarrow.dataset
print("PyArrow version:", pyarrow.__version__)
table = pyarrow.table([
    [],
], schema=pyarrow.schema([
    ("ints", "int64"),
]))
with tempfile.TemporaryDirectory() as d:
    pyarrow.dataset.write_dataset(table, d, format="feather")
    print(list(Path(d).iterdir()))
{code}
Output
{noformat}
> python repro.py
PyArrow version: 9.0.0
[]
{noformat}
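Until the behavior is decided, the "raise an error" option from the description could be approximated on the caller side. The following is only a sketch under stated assumptions: write_nonempty and EmptyDatasetError are hypothetical names, not part of PyArrow, and the only thing assumed about the input is that it exposes a num_rows attribute, as pyarrow.Table does.

```python
from pathlib import Path


class EmptyDatasetError(ValueError):
    """Hypothetical error type for illustration; not part of PyArrow."""


def write_nonempty(table, base_dir, write_fn, **kwargs):
    """Call write_fn(table, base_dir, **kwargs), refusing zero-row input.

    Hypothetical helper. `table` only needs a `num_rows` attribute.
    Returns the regular files found under base_dir after the write.
    """
    if table.num_rows == 0:
        # Fail loudly instead of silently producing no files.
        raise EmptyDatasetError(
            f"refusing to write a dataset with 0 rows to {base_dir}"
        )
    write_fn(table, base_dir, **kwargs)
    # List what actually landed on disk, including partition subdirectories.
    return sorted(p for p in Path(base_dir).rglob("*") if p.is_file())
```

With the reproduction above, this would be called as write_nonempty(table, d, pyarrow.dataset.write_dataset, format="feather"), which raises EmptyDatasetError for the zero-row table instead of returning with an empty directory.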