Posted to jira@arrow.apache.org by "Martin Thøgersen (Jira)" <ji...@apache.org> on 2022/01/28 12:57:00 UTC
[jira] [Updated] (ARROW-15494) [Docs] Clarify existing_data_behavior docstring
[ https://issues.apache.org/jira/browse/ARROW-15494?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Martin Thøgersen updated ARROW-15494:
-------------------------------------
Summary: [Docs] Clarify existing_data_behavior docstring (was: [Docs] Clarify {{existing_data_behavior}} docstring)
> [Docs] Clarify existing_data_behavior docstring
> -----------------------------------------------
>
> Key: ARROW-15494
> URL: https://issues.apache.org/jira/browse/ARROW-15494
> Project: Apache Arrow
> Issue Type: Improvement
> Components: Documentation
> Affects Versions: 7.0.1
> Reporter: Martin Thøgersen
> Priority: Major
>
> Slightly clarify the wording of the {{existing_data_behavior}} parameter of {{pyarrow.dataset.write_dataset()}}:
> [https://github.com/apache/arrow/blob/a27c55660e575a3987283d5d9e443642db48f215/python/pyarrow/dataset.py#L812-L827]
> Proposed wording:
> {noformat}
> existing_data_behavior : 'error' | 'overwrite_or_ignore' | \
> 'delete_matching'
> Controls how the dataset will handle data that already exists in
> the destination. The default behavior ('error') is to raise an error
> if any data exists in the `base_dir` destination.
> 'overwrite_or_ignore' will ignore any existing data and will
> overwrite files with the same name as an output file. Other
> existing files will be ignored. This behavior, in combination
> with a unique basename_template for each write, will allow for
> an append workflow.
> 'delete_matching' is useful when you are writing a partitioned
> dataset. The first time each partition leaf-level directory is
> encountered the entire leaf-level directory will be deleted. This
> allows you to overwrite old partitions completely.
> {noformat}
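> That is, clarify that:
> - {{error}} applies at the {{base_dir}} level: any existing data under {{base_dir}} raises an error.
> - {{delete_matching}} applies at the leaf-level partition directory: only the leaf directories being written to are deleted.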
--
This message was sent by Atlassian Jira
(v8.20.1#820001)