Posted to issues@beam.apache.org by "Jérémie Bigras-Dunberry (Jira)" <ji...@apache.org> on 2021/07/31 20:09:00 UTC
[jira] [Updated] (BEAM-12701) Converting two deferred dataframes
to csv in the same pipeline causes PCollection label collision
[ https://issues.apache.org/jira/browse/BEAM-12701?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jérémie Bigras-Dunberry updated BEAM-12701:
-------------------------------------------
Summary: Converting two deferred dataframes to csv in the same pipeline causes PCollection label collision (was: Converting two dataframe to_csv in the same pipeline causes PCollection label collision)
> Converting two deferred dataframes to csv in the same pipeline causes PCollection label collision
> --------------------------------------------------------------------------------------------------
>
> Key: BEAM-12701
> URL: https://issues.apache.org/jira/browse/BEAM-12701
> Project: Beam
> Issue Type: Bug
> Components: io-py-common
> Affects Versions: 2.31.0
> Reporter: Jérémie Bigras-Dunberry
> Priority: P2
>
>
> If you call to_csv on a DeferredDataFrame twice in a single pipeline, like this:
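> {code:python}
> import apache_beam as beam
> import pandas as pd
> from apache_beam.dataframe.convert import to_dataframe, to_pcollection
>
> df1 = pd.DataFrame.from_records({"a": "b"}, index=[0])
> df2 = pd.DataFrame.from_records({"a": "b"}, index=[0])
> with beam.Pipeline() as p:
>     df1 = to_dataframe(to_pcollection(df1, pipeline=p), label="df1")
>     df2 = to_dataframe(to_pcollection(df2, pipeline=p), label="df2")
>     df1.to_csv("test.csv")
>     df2.to_csv("test2.csv")  # the second call raises the RuntimeError
> {code}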
> You get this error on the second to_csv call:
>
> {code:java}
> RuntimeError: A transform with label "ToPCollection(df)" already exists in the pipeline. To apply a transform with a specified label write pvalue | "label" >> transform
> {code}
> I think this happens because to_csv calls to_pcollection without a label, so an identical label ("ToPCollection(df)") is inferred for both to_csv calls.
>
>
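The suspected mechanism can be illustrated with a small toy model in plain Python. This is only a sketch and does not use or reproduce Beam's actual code: ToyPipeline, to_pcollection_label, and this to_csv are hypothetical stand-ins. It assumes that a missing label is derived from the name of the variable holding the frame, which the "ToPCollection(df)" text in the error message suggests.

```python
class ToyPipeline:
    """Hypothetical stand-in for a pipeline that requires unique transform labels."""

    def __init__(self):
        self._labels = set()

    def apply(self, label):
        # Reject a label that has already been registered in this pipeline.
        if label in self._labels:
            raise RuntimeError(
                f'A transform with label "{label}" already exists in the pipeline.'
            )
        self._labels.add(label)
        return label


def to_pcollection_label(var_name, label=None):
    # Assumption: with no explicit label, one is inferred from the variable name.
    return label if label is not None else f"ToPCollection({var_name})"


def to_csv(pipeline, path, label=None):
    # Inside this function the frame is always bound to the same local name
    # ("df"), so the inferred label is identical on every call.
    return pipeline.apply(to_pcollection_label("df", label))


# Without labels, the second call collides with the first.
p = ToyPipeline()
to_csv(p, "test.csv")
try:
    to_csv(p, "test2.csv")
except RuntimeError as err:
    print(err)

# With a distinct label per call, both registrations succeed.
q = ToyPipeline()
to_csv(q, "test.csv", label="ToPCollection(test.csv)")
to_csv(q, "test2.csv", label="ToPCollection(test2.csv)")
```

If the diagnosis is right, the toy model suggests that any fix has to make the inner label unique per call, for example by deriving it from the output path or by letting the caller supply one.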
--
This message was sent by Atlassian Jira
(v8.3.4#803005)