Posted to commits@beam.apache.org by "ASF GitHub Bot (JIRA)" <ji...@apache.org> on 2017/10/11 21:13:00 UTC

[jira] [Commented] (BEAM-2774) Add I/O source for VCF files (python)

    [ https://issues.apache.org/jira/browse/BEAM-2774?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16200976#comment-16200976 ] 

ASF GitHub Bot commented on BEAM-2774:
--------------------------------------

GitHub user mhsaul opened a pull request:

    https://github.com/apache/beam/pull/3979

    [BEAM-2774] Add I/O source to read VCF files

    Added an I/O transform, `ReadFromVcf`, that reads VCF files into a `PCollection` of `Variant` objects. Modified `_TextSource` to optionally process file headers so that VCF file headers can be handled.
    
    Design Doc: https://docs.google.com/document/d/1jsdxOPALYYlhnww2NLURS8NKXaFyRSJrcGbEDpY9Lkw/edit
    
    CC: @arostamianfar @chamikaramj @aaltay
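
For readers unfamiliar with the format, here is a minimal, Beam-independent sketch of the header handling described above: lines beginning with `#` are collected as header lines, and each remaining tab-separated line becomes a record. It is not the code in this pull request. The `Variant` tuple, its field names, and the `parse_vcf` helper are hypothetical and only illustrate the idea using the fixed VCF columns (CHROM, POS, ID, REF, ALT).

```python
import collections

# Hypothetical stand-in for the `Variant` type named in the PR description.
# The field names are assumptions, not the PR's actual definition.
Variant = collections.namedtuple(
    "Variant", ["reference_name", "pos", "reference_bases", "alternate_bases"])


def parse_vcf(lines):
    """Split VCF text lines into (header_lines, variants).

    Header lines start with '#'. Data lines are tab-separated and begin
    with the fixed columns CHROM, POS, ID, REF, ALT.
    """
    header_lines = []
    variants = []
    for raw in lines:
        line = raw.rstrip("\n")
        if not line:
            continue  # skip blank lines
        if line.startswith("#"):
            header_lines.append(line)
            continue
        chrom, pos, _id, ref, alt = line.split("\t")[:5]
        variants.append(Variant(
            reference_name=chrom,
            pos=int(pos),  # kept 1-based, as written in the file
            reference_bases=ref,
            # "." marks a missing ALT; multiple alternates are comma-separated
            alternate_bases=[] if alt == "." else alt.split(","),
        ))
    return header_lines, variants


sample = [
    "##fileformat=VCFv4.3\n",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n",
    "20\t14370\trs6054257\tG\tA\t29\tPASS\tDP=14\n",
    "20\t1110696\t.\tA\tG,T\t67\tPASS\tDP=10\n",
]
headers, variants = parse_vcf(sample)
```

In a real source the header lines would typically be consumed once per file before record parsing begins, which is the kind of hook the `_TextSource` change is described as adding.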


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/mhsaul/beam miles_saul--vsf-io-source

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/beam/pull/3979.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #3979
    
----
commit 3f699fcd286c8509cfc404d7c2bec35fd6342347
Author: Miles Saul <ms...@msaul0.wat.corp.google.com>
Date:   2017-10-11T19:00:03Z

    Added vcf file io source and modified _TextSource to optionally handle headers

----


> Add I/O source for VCF files (python)
> -------------------------------------
>
>                 Key: BEAM-2774
>                 URL: https://issues.apache.org/jira/browse/BEAM-2774
>             Project: Beam
>          Issue Type: New Feature
>          Components: sdk-py-core
>            Reporter: Asha Rostamianfar
>            Assignee: Miles Saul
>   Original Estimate: 336h
>  Remaining Estimate: 336h
>
> A new I/O source for reading (and eventually writing) VCF files [1] for Python. The design doc is available at https://docs.google.com/document/d/1jsdxOPALYYlhnww2NLURS8NKXaFyRSJrcGbEDpY9Lkw/edit
> [1] http://samtools.github.io/hts-specs/VCFv4.3.pdf



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)