Posted to user@beam.apache.org by deepak kumar <kd...@gmail.com> on 2019/09/13 02:46:58 UTC

How to debug dataflow locally

Hi All,
I am trying to come up with a framework that would help debug a Dataflow
job end to end locally.
Dataflow can work with hundreds of sources and sinks.
Is there a framework already to set up these sources and sinks locally, e.g.
if my Dataflow job reads from BQ and inserts into Bigtable?
Please let me know if someone is already doing this.
If yes, then how?

Thanks
Deepak

Re: How to debug dataflow locally

Posted by Lukasz Cwik <lc...@google.com>.
In general there is no generic fake/mock/emulator that runs locally for all
sources/sinks. Configuration and data are handled on a case-by-case basis.
Most of the Apache Beam integration tests either launch a local
implementation that is specific to the source/sink (such as an embedded DB
for JdbcIO) or use the real service and populate it with test data (such as
BigQuery).
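
To make the first approach concrete, here is a rough sketch in the spirit
of the JdbcIO tests, with an embedded H2 database standing in for the real
source (assumes H2 and beam-sdks-java-io-jdbc on the classpath; the table,
rows, and query are made up for illustration):

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;
import org.apache.beam.sdk.Pipeline;
import org.apache.beam.sdk.coders.StringUtf8Coder;
import org.apache.beam.sdk.io.jdbc.JdbcIO;
import org.apache.beam.sdk.options.PipelineOptionsFactory;
import org.apache.beam.sdk.testing.PAssert;
import org.apache.beam.sdk.values.PCollection;

public class JdbcIoLocalDebug {
  public static void main(String[] args) throws Exception {
    // Seed the embedded database to look like the real source would.
    String url = "jdbc:h2:mem:debugdb;DB_CLOSE_DELAY=-1";
    try (Connection conn = DriverManager.getConnection(url);
        Statement stmt = conn.createStatement()) {
      stmt.execute("CREATE TABLE users (name VARCHAR(64))");
      stmt.execute("INSERT INTO users VALUES ('alice'), ('bob')");
    }

    // No runner specified, so this uses the DirectRunner and stays local.
    Pipeline p = Pipeline.create(PipelineOptionsFactory.create());

    PCollection<String> names =
        p.apply(
            JdbcIO.<String>read()
                .withDataSourceConfiguration(
                    JdbcIO.DataSourceConfiguration.create("org.h2.Driver", url))
                .withQuery("SELECT name FROM users")
                .withRowMapper(rs -> rs.getString(1))
                .withCoder(StringUtf8Coder.of()));

    PAssert.that(names).containsInAnyOrder("alice", "bob");
    p.run().waitUntilFinish();
  }
}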

You might want to take a look at the integration tests that are part of
Apache Beam to gather more details.
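
For the BQ -> Bigtable case in the original question, the unit-test pattern
those tests use boils down to: keep the transforms, swap the edges. Feed the
middle of the pipeline from Create.of(...) instead of
BigQueryIO.readTableRows(), and assert on the output with PAssert instead of
writing through BigtableIO. A rough sketch on the DirectRunner (the mapping
logic, field names, and rows are hypothetical):

import com.google.api.services.bigquery.model.TableRow;
import org.apache.beam.sdk.io.gcp.bigquery.TableRowJsonCoder;
import org.apache.beam.sdk.testing.PAssert;
import org.apache.beam.sdk.testing.TestPipeline;
import org.apache.beam.sdk.transforms.Create;
import org.apache.beam.sdk.transforms.MapElements;
import org.apache.beam.sdk.values.PCollection;
import org.apache.beam.sdk.values.TypeDescriptors;
import org.junit.Rule;
import org.junit.Test;

public class MyPipelineLogicTest {

  // TestPipeline uses the DirectRunner by default, so everything runs
  // in-process with no GCP credentials or services involved.
  @Rule public final transient TestPipeline p = TestPipeline.create();

  @Test
  public void coreLogicWorksWithoutBigQueryOrBigtable() {
    // In the real job this PCollection comes from BigQueryIO.readTableRows();
    // here a couple of hand-built rows stand in for it.
    PCollection<TableRow> input =
        p.apply(
            Create.of(
                    new TableRow().set("user_id", "u1").set("score", "10"),
                    new TableRow().set("user_id", "u2").set("score", "20"))
                .withCoder(TableRowJsonCoder.of()));

    // The logic under test: whatever composite you normally apply between
    // BigQueryIO and BigtableIO (a trivial hypothetical mapping here).
    PCollection<String> output =
        input.apply(
            MapElements.into(TypeDescriptors.strings())
                .via((TableRow row) ->
                    row.get("user_id") + ":" + row.get("score")));

    // In the real job this would feed BigtableIO.write(); assert instead.
    PAssert.that(output).containsInAnyOrder("u1:10", "u2:20");

    p.run().waitUntilFinish();
  }
}

Once the logic passes locally, you flip the Create/PAssert edges back to the
real BigQueryIO/BigtableIO transforms and submit the same pipeline to
Dataflow unchanged.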


On Thu, Sep 12, 2019 at 7:47 PM deepak kumar <kd...@gmail.com> wrote:

> Hi All,
> I am trying to come up with a framework that would help debug a Dataflow
> job end to end locally.
> Dataflow can work with hundreds of sources and sinks.
> Is there a framework already to set up these sources and sinks locally, e.g.
> if my Dataflow job reads from BQ and inserts into Bigtable?
> Please let me know if someone is already doing this.
> If yes, then how?
>
> Thanks
> Deepak
>