Posted to github@beam.apache.org by GitBox <gi...@apache.org> on 2020/10/20 16:13:03 UTC

[GitHub] [beam] bradmiro commented on pull request #12963: [BEAM-10983] Add getting started from Spark page

bradmiro commented on pull request #12963:
URL: https://github.com/apache/beam/pull/12963#issuecomment-712964924


   Generally, Spark RDDs are used for unstructured data, whereas Spark DataFrames are used for structured data since they can be supplied a schema; transformations on DataFrames therefore tend to be more efficient. Does Beam have any sort of similar split, or would you use PCollections in both cases?
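
   For context, here is a minimal sketch (illustrative only, assuming the Beam Python SDK and its schema-aware beam.Row and GroupBy APIs) of how a PCollection can carry a schema, which is broadly analogous to named, typed columns in a Spark DataFrame:

       import apache_beam as beam

       with beam.Pipeline() as pipeline:
           totals = (
               pipeline
               | 'Create' >> beam.Create([
                   beam.Row(user='alice', score=10),
                   beam.Row(user='bob', score=12),
                   beam.Row(user='alice', score=5),
               ])
               # Because the elements carry a schema, downstream transforms
               # can refer to fields by name rather than by position.
               | 'SumPerUser' >> beam.GroupBy('user').aggregate_field(
                   'score', sum, 'total_score')
               | 'Print' >> beam.Map(print)
           )

   In this sketch the element names (user, score, total_score) are made up for illustration; the point is only that schema-aware PCollections allow field-level operations, while PCollections of arbitrary Python objects behave more like RDDs.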


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org