Posted to commits@beam.apache.org by "ASF GitHub Bot (JIRA)" <ji...@apache.org> on 2016/09/12 17:19:20 UTC
[jira] [Commented] (BEAM-625) Make Dataflow Python Materialized PCollection representation more efficient
[ https://issues.apache.org/jira/browse/BEAM-625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15484680#comment-15484680 ]
ASF GitHub Bot commented on BEAM-625:
-------------------------------------
GitHub user katsiapis opened a pull request:
https://github.com/apache/incubator-beam/pull/946
[BEAM-625] Making Dataflow Python Materialized PCollection representation more efficient (4 of several).
- Refactoring code in avroio.py to allow for re-use.
- Making sure that _AvroUtils validates the sync_marker.
This should detect corrupted or improperly formatted Avro files.
- Simplifying block reading.
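For readers unfamiliar with the sync marker check mentioned above, here is a minimal sketch of the idea in Python 3. It is not the code from avroio.py or _AvroUtils; the names read_long, write_long, read_block and SYNC_SIZE are hypothetical and chosen only for illustration. It relies on the Avro object container layout, in which each data block is a record count, a byte size, the serialized records, and then the 16-byte sync marker declared in the file header. Decompression and record decoding are left out.

```python
import io

SYNC_SIZE = 16  # Avro container files use a 16-byte sync marker.


def read_long(stream):
    """Decode one Avro long (zigzag-encoded, variable-length)."""
    result = 0
    shift = 0
    while True:
        byte = stream.read(1)
        if not byte:
            raise EOFError("stream ended inside a variable-length long")
        value = byte[0]
        result |= (value & 0x7F) << shift
        if not value & 0x80:
            break
        shift += 7
    # Undo the zigzag mapping to recover the signed value.
    return (result >> 1) ^ -(result & 1)


def write_long(value):
    """Encode one Avro long. Hypothetical helper, used only to build sample data."""
    zigzag = (value << 1) ^ (value >> 63)
    out = bytearray()
    while zigzag & ~0x7F:
        out.append((zigzag & 0x7F) | 0x80)
        zigzag >>= 7
    out.append(zigzag)
    return bytes(out)


def read_block(stream, expected_sync):
    """Read one data block and verify its trailing sync marker.

    Returns (num_records, payload). The payload is returned as raw bytes;
    any compression codec is ignored in this sketch.
    """
    num_records = read_long(stream)
    block_size = read_long(stream)
    if block_size < 0:
        raise ValueError("negative block size: %d" % block_size)
    payload = stream.read(block_size)
    if len(payload) != block_size:
        raise ValueError("truncated block")
    sync = stream.read(SYNC_SIZE)
    if sync != expected_sync:
        # A mismatch suggests a corrupted or improperly formatted file.
        raise ValueError("sync marker mismatch at end of block")
    return num_records, payload


# Usage: a well-formed block with two records in three payload bytes.
header_sync = bytes(range(SYNC_SIZE))
good_block = write_long(2) + write_long(3) + b"abc" + header_sync
num_records, payload = read_block(io.BytesIO(good_block), header_sync)
```

Checking the marker after every block means a damaged size field or a misaligned read fails at the next block boundary instead of producing garbage records further along.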
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/katsiapis/incubator-beam python-sdk
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/incubator-beam/pull/946.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #946
----
commit cbe928c2d6d2cb79adecde615f3c2d86152dae2d
Author: Gus Katsiapis <ka...@katsiapis-linux.mtv.corp.google.com>
Date: 2016-09-12T17:11:44Z
Refactorings and enhancements in avroio to allow for reuse.
----
> Make Dataflow Python Materialized PCollection representation more efficient
> ---------------------------------------------------------------------------
>
> Key: BEAM-625
> URL: https://issues.apache.org/jira/browse/BEAM-625
> Project: Beam
> Issue Type: Improvement
> Components: sdk-py
> Reporter: Konstantinos Katsiapis
> Assignee: Frances Perry
>
> This will be a multi-step process that involves adding better support for compression as well as Avro.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)