Posted to issues@arrow.apache.org by "Alejandro Marco Ramos (Jira)" <ji...@apache.org> on 2022/07/13 19:35:00 UTC

[jira] [Created] (ARROW-17068) [Python] "pyarrow.parquet.write_to_dataset", option "file_visitor" nothing happen

Alejandro Marco Ramos created ARROW-17068:
---------------------------------------------

             Summary: [Python] "pyarrow.parquet.write_to_dataset", option "file_visitor" nothing happen
                 Key: ARROW-17068
                 URL: https://issues.apache.org/jira/browse/ARROW-17068
             Project: Apache Arrow
          Issue Type: Bug
          Components: Python
    Affects Versions: 8.0.0
            Reporter: Alejandro Marco Ramos


When I try to use the "file_visitor" callback, nothing happens: the callback is never invoked.

 

Example:
{code:python}
import pyarrow as pa
from pyarrow import parquet as pa_parquet

table = pa.table([
        pa.array([1, 2, 3, 4, 5]),
        pa.array(["a", "b", "c", "d", "e"]),
        pa.array([1.0, 2.0, 3.0, 4.0, 5.0])
    ], names=["col1", "col2", "col3"])

written_files = []
pa_parquet.write_to_dataset(table, partition_cols=["col2"], root_path="tests", file_visitor=lambda x: written_files.append(x.path))

assert len(written_files) > 0  # Fails with AssertionError: written_files is empty
{code}
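
For reference, the behaviour the example expects (one callback invocation per file written) can be sketched without pyarrow. This is only a toy stand-in under stated assumptions, not pyarrow's implementation: `write_partitioned` and `WrittenFileStub` are hypothetical names introduced here, and only the `path` attribute used in the report is modelled.

```python
import tempfile
from dataclasses import dataclass
from pathlib import Path
from typing import Callable, Optional


@dataclass
class WrittenFileStub:
    """Hypothetical stand-in for the object handed to file_visitor."""
    path: str


def write_partitioned(rows, key, root_path,
                      file_visitor: Optional[Callable] = None):
    """Toy writer (not pyarrow): one text file per distinct value of `key`."""
    groups = {}
    for row in rows:
        groups.setdefault(row[key], []).append(row)
    for value, group in groups.items():
        part_dir = Path(root_path) / f"{key}={value}"
        part_dir.mkdir(parents=True, exist_ok=True)
        out = part_dir / "part-0.txt"
        out.write_text("\n".join(repr(r) for r in group))
        # Expected contract: the callback fires once per file created.
        if file_visitor is not None:
            file_visitor(WrittenFileStub(path=str(out)))


rows = [{"col1": i + 1, "col2": c} for i, c in enumerate("abcde")]
written_files = []
with tempfile.TemporaryDirectory() as root:
    write_partitioned(rows, "col2", root,
                      file_visitor=lambda f: written_files.append(f.path))

# Five distinct col2 values, so five files and five callback invocations.
assert len(written_files) == 5
```

With a writer that honours the callback, `written_files` ends up with one path per partition, which is the outcome the assertion in the report checks for.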



--
This message was sent by Atlassian Jira
(v8.20.10#820010)