Posted to dev@beam.apache.org by David Ciudad Gomez <da...@gmail.com> on 2021/09/29 06:45:22 UTC

Will Apache Beam adopt a Pandas-like syntax to program in Python?

Hi,

Apache Spark is adopting a new Pandas-like syntax
(https://github.com/databricks/koalas) for programming in Python. Will
Apache Beam adopt a similar syntax in the future?

Thanks and best regards.

David Ciudad

Re: Will Apache Beam adopt a Pandas-like syntax to program in Python?

Posted by Brian Hulette <bh...@google.com>.
Hi David,

Yes! Apache Beam now has a DataFrame API [1], which provides similar
functionality. It left experimental status in Beam 2.32.0 [2]. You can
find some example pipelines that use it at [3].

Brian

[1] https://beam.apache.org/documentation/dsls/dataframes/overview/
[2] https://beam.apache.org/blog/beam-2.32.0/
[3] https://github.com/apache/beam/tree/master/sdks/python/apache_beam/examples/dataframe
