Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2022/02/13 10:11:16 UTC
[GitHub] [arrow] HaykManukyanAvetiky opened a new issue #12413: Pyarrow write dataset ignores delimiter
HaykManukyanAvetiky opened a new issue #12413:
URL: https://github.com/apache/arrow/issues/12413
Hi guys,
I tried to report this bug (or possible bug) in Jira but could not, so I am writing here.
I have a dataset, and when I try to write it as a TSV (tab-separated) file, pyarrow writes CSV anyway.
Here is my code:
```python
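import pyarrow.dataset as ds
from pyarrow import csv

# `table` is an existing pyarrow.Table that includes a 'month' column
ds.write_dataset(data=table, base_dir='adapter/tsv/',
                 basename_template='my-unique-name-{i}.tsv',
                 format=ds.CsvFileFormat(parse_options=csv.ParseOptions(delimiter="\t")),
                 partitioning=['month'],
                 existing_data_behavior='overwrite_or_ignore')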
```
Here is what I am getting:
```csv
"day","year"
26,1958
11,1912
26,1942
```
Here is what I should get:
```tsv
day year
26 1958
11 1912
26 1942
```
It feels like pyarrow is ignoring the format or the delimiter.
Thanks in advance.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [arrow] HaykManukyanAvetiky commented on issue #12413: Pyarrow write dataset ignores delimiter
Posted by GitBox <gi...@apache.org>.
HaykManukyanAvetiky commented on issue #12413:
URL: https://github.com/apache/arrow/issues/12413#issuecomment-1039915015
ok thanks
[GitHub] [arrow] HaykManukyanAvetiky closed issue #12413: Pyarrow write dataset ignores delimiter
Posted by GitBox <gi...@apache.org>.
HaykManukyanAvetiky closed issue #12413:
URL: https://github.com/apache/arrow/issues/12413
[GitHub] [arrow] lidavidm commented on issue #12413: Pyarrow write dataset ignores delimiter
Posted by GitBox <gi...@apache.org>.
lidavidm commented on issue #12413:
URL: https://github.com/apache/arrow/issues/12413#issuecomment-1038302606
Unfortunately ParseOptions only applies to reading data; [WriteOptions](https://github.com/apache/arrow/blob/6b7c7a2702466f7c3c9c1f9dd41bc42458cff398/cpp/src/arrow/csv/options.h#L187) controls writing data. We don't currently support changing the delimiter when writing.
Please see https://issues.apache.org/jira/browse/ARROW-15672.
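Until the writer accepts a delimiter, one possible workaround is to write the tab-separated output with Python's standard `csv` module instead. The sketch below is only an illustration, not part of pyarrow: `write_tsv` is a hypothetical helper, and the plain dict stands in for what `table.to_pydict()` would return for the `day` and `year` columns from the report. It writes a single file and does not reproduce the `month` partitioning that `write_dataset` performs.

```python
import csv
import io


def write_tsv(columns, out):
    """Write a column-oriented mapping (name -> list of values) as tab-separated text.

    Hypothetical helper for illustration; not a pyarrow API.
    """
    names = list(columns)
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    # Header row first, then the columns transposed into rows.
    writer.writerow(names)
    writer.writerows(zip(*(columns[name] for name in names)))


# Assumption: stand-in for table.to_pydict() on the table from the report.
columns = {"day": [26, 11, 26], "year": [1958, 1912, 1942]}

buf = io.StringIO()
write_tsv(columns, buf)
tsv_text = buf.getvalue()
print(tsv_text, end="")
```

To write to disk, pass a file opened with `open(path, "w", newline="")` instead of the `io.StringIO` buffer. For large tables this goes through Python objects row by row, so it will be slower than a native writer.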