Posted to commits@beam.apache.org by "ASF GitHub Bot (JIRA)" <ji...@apache.org> on 2016/09/06 06:21:20 UTC
[jira] [Commented] (BEAM-553) Add a custom text source
[ https://issues.apache.org/jira/browse/BEAM-553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15466569#comment-15466569 ]
ASF GitHub Bot commented on BEAM-553:
-------------------------------------
GitHub user chamikaramj opened a pull request:
https://github.com/apache/incubator-beam/pull/920
[BEAM-553] Adds a text source for Python SDK.
The current text source (fileio.TextFileSource) is specific to the Dataflow runner. This adds a runner-independent TextSource based on the iobase.BoundedSource interface.
Adds a textio module that contains a text source, a text sink, and PTransforms that can be used to read and write text files.
Adds a significant number of tests.
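For readers unfamiliar with how a bounded, splittable text source can divide a file among workers, here is a minimal sketch in plain Python. It is not the code from this pull request and does not use the Beam API. The names read_lines_in_range and read_all_splits are hypothetical, and the sketch assumes UTF-8 text with newline-terminated records. The rule it illustrates: a line belongs to the byte range [start, stop) if its first byte falls inside that range.

```python
import os


def read_lines_in_range(path, start, stop):
    """Yield the lines whose first byte lies in [start, stop).

    Hypothetical helper for illustration; not part of the Beam SDK.
    """
    with open(path, "rb") as f:
        if start > 0:
            # Step back one byte and discard through the next newline.
            # If start is already a line boundary, only the preceding
            # newline is discarded; otherwise the partial line is skipped,
            # because it belongs to the previous range.
            f.seek(start - 1)
            f.readline()
        while True:
            pos = f.tell()
            if pos >= stop:
                break
            line = f.readline()
            if not line:
                break  # end of file
            # A line that starts before stop is read in full, even if it
            # extends past stop.
            yield line.rstrip(b"\r\n").decode("utf-8")


def read_all_splits(path, split_size):
    """Read the file as consecutive byte ranges of split_size bytes."""
    size = os.path.getsize(path)
    return [
        list(read_lines_in_range(path, s, min(s + split_size, size)))
        for s in range(0, size, split_size)
    ]
```

Because every range applies the same ownership rule, adjacent ranges neither drop nor duplicate a line, regardless of where the split points fall.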
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/chamikaramj/incubator-beam text_source
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/incubator-beam/pull/920.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #920
----
commit bb1ff90307b563656a54731cada05e41cd9e82b8
Author: Chamikara Jayalath <ch...@google.com>
Date: 2016-08-30T01:08:46Z
Adds a text source to Python SDK.
----
> Add a custom text source
> ------------------------
>
> Key: BEAM-553
> URL: https://issues.apache.org/jira/browse/BEAM-553
> Project: Beam
> Issue Type: New Feature
> Components: sdk-py
> Reporter: Chamikara Jayalath
> Assignee: Chamikara Jayalath
>
> Currently, the text source implementation available for the Python SDK [1] is a Dataflow native source, which only works efficiently with the Dataflow runner. We should add a custom text source on top of the custom file-based source framework [2] so that other runner implementations can potentially use the same text source implementation.
> The custom text source implementation for the Java SDK is at [3].
> [1] https://github.com/apache/incubator-beam/blob/python-sdk/sdks/python/apache_beam/io/fileio.py#L70
> [2] https://github.com/apache/incubator-beam/blob/python-sdk/sdks/python/apache_beam/io/filebasedsource.py
> [3] https://github.com/apache/incubator-beam/blob/master/sdks/java/core/src/main/java/org/apache/beam/sdk/io/TextIO.java#L745