Posted to user@spark.apache.org by "abhilash.kr" <ab...@gmail.com> on 2021/05/14 17:15:54 UTC

Multiple destination single source

I have a single source of data, and the processed records have to be routed
to multiple destinations, i.e.:

1. Read the source data.
2. Based on a condition, route records to the following destinations:
    1. Kafka for error records
    2. Success records matching one condition: S3 bucket "A", folder "a"
    3. Success records matching another condition: S3 bucket "A", folder "b"
    4. Success records matching a third condition: a different S3 bucket

How can I achieve this in PySpark?
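
For context, here is roughly the shape I have in mind for the batch case.
A minimal sketch only: the source path, bucket names, Kafka settings, and
the "status"/"category" columns are placeholders, and the Kafka write
assumes the spark-sql-kafka package is on the classpath.

from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("single-source-multi-sink").getOrCreate()

# 1. Read the source data once and cache it, since it feeds several sinks.
df = spark.read.parquet("s3a://source-bucket/input/")
df.cache()

# 2.1. Error records -> Kafka (the Kafka sink expects a string "value" column).
(df.filter(F.col("status") == "error")
   .select(F.to_json(F.struct("*")).alias("value"))
   .write
   .format("kafka")
   .option("kafka.bootstrap.servers", "broker:9092")
   .option("topic", "error-records")
   .save())

success = df.filter(F.col("status") == "success")

# 2.2. Success records matching one condition -> bucket "A", folder "a".
success.filter(F.col("category") == "a").write.mode("append").parquet("s3a://A/a/")

# 2.3. Success records matching another condition -> bucket "A", folder "b".
success.filter(F.col("category") == "b").write.mode("append").parquet("s3a://A/b/")

# 2.4. Remaining success records -> a different bucket.
(success.filter(~F.col("category").isin("a", "b"))
        .write.mode("append").parquet("s3a://other-bucket/out/"))

df.unpersist()

The cache() is there because each write would otherwise re-read the source.
Is this filter-per-sink approach reasonable, or is there a better pattern?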

Are there any resources on design patterns or commonly followed industry
architectural patterns for Apache Spark? For example, covering the following
source/destination combinations:

| Source   | Destination |
| -------- | ----------- |
| Single   | Single      |
| Multiple | Single      |
| Single   | Multiple    |
| Multiple | Multiple    |
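
For the streaming variant of the Single -> Multiple row, my understanding is
that foreachBatch in Structured Streaming is the usual way to fan one stream
out to several sinks. A minimal sketch with the same placeholder names as
above (the schema is also a placeholder; file-based streaming sources require
one up front):

from pyspark.sql import SparkSession
from pyspark.sql import functions as F
from pyspark.sql.types import StructType, StructField, StringType

spark = SparkSession.builder.appName("stream-router").getOrCreate()

source_schema = StructType([
    StructField("status", StringType()),
    StructField("category", StringType()),
])

def route_batch(batch_df, batch_id):
    # Each micro-batch arrives as a plain DataFrame, so the same
    # filter-and-write routing as in the batch sketch applies here.
    batch_df.persist()
    (batch_df.filter(F.col("status") == "error")
        .select(F.to_json(F.struct("*")).alias("value"))
        .write.format("kafka")
        .option("kafka.bootstrap.servers", "broker:9092")
        .option("topic", "error-records")
        .save())
    (batch_df.filter(F.col("status") == "success")
        .write.mode("append").parquet("s3a://A/a/"))
    batch_df.unpersist()

(spark.readStream
      .schema(source_schema)
      .json("s3a://source-bucket/input/")
      .writeStream
      .foreachBatch(route_batch)
      .option("checkpointLocation", "s3a://checkpoints/router/")
      .start()
      .awaitTermination())

Is that the recommended approach for these multi-destination cases?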



