Posted to issues@beam.apache.org by "Eugene Nikolaiev (Jira)" <ji...@apache.org> on 2021/10/24 13:02:00 UTC

[jira] [Comment Edited] (BEAM-12730) Add custom delimiters to Python TextIO reads

    [ https://issues.apache.org/jira/browse/BEAM-12730?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17433437#comment-17433437 ] 

Eugene Nikolaiev edited comment on BEAM-12730 at 10/24/21, 1:01 PM:
--------------------------------------------------------------------

[~dmitrii_kuzin] would you mind adding the custom delimiter support to the {{ReadAllFromText}} transform as well for completeness?

[~danoliveira] The {{WriteToText}} transform would also benefit from custom delimiter support; the Java SDK already has this. This sounds like a separate ticket.


was (Author: eugenenikolaiev):
[~dmitrii_kuzin] would you mind adding the custom delimiter support to the {{ReadAllFromText }}transform as well for completeness?

[~danoliveira] The {{WriteToText}} transform would also benefit from custom delimiter support, Java SDK has this. This sounds like a separate ticket.

> Add custom delimiters to Python TextIO reads
> --------------------------------------------
>
>                 Key: BEAM-12730
>                 URL: https://issues.apache.org/jira/browse/BEAM-12730
>             Project: Beam
>          Issue Type: New Feature
>          Components: io-py-common, io-py-files
>            Reporter: Daniel Oliveira
>            Assignee: Dmitrii Kuzin
>            Priority: P2
>              Labels: beginner, newbie, starter
>          Time Spent: 9h 10m
>  Remaining Estimate: 0h
>
> A common request by users is to be able to split text files read by TextIO on delimiters other than newline. The Java SDK already supports this feature.
> The current delimiter code is [located here|https://github.com/apache/beam/blob/v2.31.0/sdks/python/apache_beam/io/textio.py#L236] and defaults to newlines. This function could easily be modified to also handle custom delimiters. Changing this would also necessitate changing the API for the various TextIO.Read methods and adding documentation.
> This seems like a good starter bug for making more in-depth contributions to Beam Python.
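
To illustrate the kind of change described above, here is a minimal sketch of splitting a byte stream into records on a caller-supplied delimiter, with newline as the default. It is not Beam's implementation: the function name iter_records and its chunk_size parameter are hypothetical, and the sketch ignores concerns the real TextIO source has to handle, such as splittable file ranges and compression.

```python
import io
from typing import BinaryIO, Iterator


def iter_records(
    stream: BinaryIO, delimiter: bytes = b"\n", chunk_size: int = 8192
) -> Iterator[bytes]:
    """Yield records from `stream`, split on `delimiter` (delimiter removed).

    Hypothetical helper for illustration only; not part of apache_beam.
    """
    if not delimiter:
        raise ValueError("delimiter must be a non-empty bytes object")
    buffer = b""
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        # Appending to the buffer lets a delimiter that straddles two
        # chunks be found once both halves have arrived.
        buffer += chunk
        while True:
            index = buffer.find(delimiter)
            if index < 0:
                break
            yield buffer[:index]
            buffer = buffer[index + len(delimiter):]
    # Any leftover bytes form a final record with no trailing delimiter.
    if buffer:
        yield buffer


if __name__ == "__main__":
    data = io.BytesIO(b"first||second||third")
    # A tiny chunk size forces the delimiter to span chunk boundaries.
    for record in iter_records(data, delimiter=b"||", chunk_size=3):
        print(record)
```

Keeping newline as the default means existing callers would see no change in behaviour, which matches the ticket's note that the current code defaults to newlines.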


