Posted to issues@beam.apache.org by "Alexey Romanenko (Jira)" <ji...@apache.org> on 2021/10/06 16:03:00 UTC

[jira] [Assigned] (BEAM-12730) Add custom delimiters to Python TextIO reads

     [ https://issues.apache.org/jira/browse/BEAM-12730?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Alexey Romanenko reassigned BEAM-12730:
---------------------------------------

    Assignee: Dmitrii Kuzin

> Add custom delimiters to Python TextIO reads
> --------------------------------------------
>
>                 Key: BEAM-12730
>                 URL: https://issues.apache.org/jira/browse/BEAM-12730
>             Project: Beam
>          Issue Type: New Feature
>          Components: io-py-common, io-py-files
>            Reporter: Daniel Oliveira
>            Assignee: Dmitrii Kuzin
>            Priority: P2
>              Labels: beginner, newbie, starter
>          Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> A common request from users is the ability to split text files read by TextIO on delimiters other than newline. The Java SDK already supports this feature.
> The current delimiter code is [located here|https://github.com/apache/beam/blob/v2.31.0/sdks/python/apache_beam/io/textio.py#L236] and defaults to newlines. This function could easily be modified to also handle custom delimiters. Changing this would also necessitate changing the API for the various TextIO.Read methods and adding documentation.
> This seems like a good starter bug for making more in-depth contributions to Beam Python.
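
For illustration only, the splitting behaviour the description asks for can be sketched outside of Beam. The sketch below is not the code in textio.py and does not use any Beam API; the function name split_records and the chunk_size parameter are hypothetical. It assumes the source is a binary file-like object read in fixed-size chunks, and it shows the main subtlety of a multi-byte delimiter: the delimiter may straddle two chunks, so the unfinished tail has to be carried over to the next read.

```python
import io
from typing import BinaryIO, Iterator


def split_records(stream: BinaryIO, delimiter: bytes = b"\n",
                  chunk_size: int = 8192) -> Iterator[bytes]:
    """Yield records from `stream`, split on `delimiter` (hypothetical helper)."""
    if not delimiter:
        raise ValueError("delimiter must be non-empty")
    buffer = b""
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        buffer += chunk
        # Everything before the last delimiter is a complete record; the
        # final piece may be incomplete (or end in half a delimiter), so
        # it stays in the buffer until more data arrives.
        *records, buffer = buffer.split(delimiter)
        yield from records
    # Emit a trailing record that has no terminating delimiter.
    if buffer:
        yield buffer
```

With a deliberately small chunk size, list(split_records(io.BytesIO(b"a||b||c"), b"||", chunk_size=2)) returns [b"a", b"b", b"c"], even though the first "||" is split across two reads. With the default delimiter the helper behaves like ordinary newline splitting.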


