Posted to user@spark.apache.org by Chetan Khatri <ch...@gmail.com> on 2019/12/12 00:14:24 UTC

How more than one spark job can write to same partition in the parquet file

Hi Spark Users,
Is it possible for two concurrent Spark jobs, each with its own Spark
session, to write to the same partition of a Parquet file?

Thanks

Re: How more than one spark job can write to same partition in the parquet file

Posted by Iqbal Singh <iq...@gmail.com>.
Hey Chetan,

I did not quite get your question. Are you trying to write to a partition
from two actions, or from two jobs? Apart from having to maintain state for
dataset completeness in that case, I don't see any issues.

We write data to a partition using two different actions in a single Spark
job. Note that "partition" here means an HDFS directory, not a Hive
partition.



On Thu, Dec 12, 2019 at 1:37 AM ayan guha <gu...@gmail.com> wrote:

> We partitioned data logically for 2 different jobs...in our use case based
> on geography...
>
> On Thu, 12 Dec 2019 at 3:39 pm, Chetan Khatri <ch...@gmail.com>
> wrote:
>
>> Thanks, If you can share alternative change in design. I would love to
>> hear from you.
>>
>> On Wed, Dec 11, 2019 at 9:34 PM ayan guha <gu...@gmail.com> wrote:
>>
>>> No we faced problem with that setup.
>>>
>>> On Thu, 12 Dec 2019 at 11:14 am, Chetan Khatri <
>>> chetan.opensource@gmail.com> wrote:
>>>
>>>> Hi Spark Users,
>>>> would that be possible to write to same partition to the parquet file
>>>> through concurrent two spark jobs with different spark session.
>>>>
>>>> thanks
>>>>
>>> --
>>> Best Regards,
>>> Ayan Guha
>>>
>> --
> Best Regards,
> Ayan Guha
>

Re: How more than one spark job can write to same partition in the parquet file

Posted by ayan guha <gu...@gmail.com>.
We partitioned the data logically between the 2 jobs; in our use case the
split was based on geography.
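
A minimal sketch of this kind of logical split, in plain Python so it runs
without a cluster. The base path, job names, and region codes are all
hypothetical and do not come from the thread; the point is only that each
job owns a disjoint set of partition directories, so no two jobs ever write
to the same one.

```python
from pathlib import PurePosixPath

# Hypothetical base path and region assignment (not from the thread).
BASE = PurePosixPath("/data/events")
JOB_REGIONS = {
    "job_a": {"EU", "APAC"},
    "job_b": {"NA", "LATAM"},
}


def partition_dir(base, region):
    """Return the Hive-style directory for one region, e.g. base/region=EU."""
    return base / f"region={region}"


def output_dirs(job):
    """Return the set of directories the given job is allowed to write to."""
    return {partition_dir(BASE, region) for region in JOB_REGIONS[job]}


def assert_disjoint(assignments):
    """Raise ValueError if any region is assigned to more than one job."""
    owner = {}
    for job, regions in assignments.items():
        for region in regions:
            if region in owner:
                raise ValueError(
                    f"{region} assigned to both {owner[region]} and {job}"
                )
            owner[region] = job


# Check the assignment once, before any job starts writing.
assert_disjoint(JOB_REGIONS)

for job in sorted(JOB_REGIONS):
    for path in sorted(output_dirs(job)):
        # A real Spark job would hand str(path) to its Parquet writer and
        # write only the rows belonging to that region.
        print(job, "->", path)
```

Validating the assignment up front turns an accidental overlap into an
immediate error instead of two jobs silently competing for one directory.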

-- 
Best Regards,
Ayan Guha

Re: How more than one spark job can write to same partition in the parquet file

Posted by Chetan Khatri <ch...@gmail.com>.
Thanks. If you can share an alternative design, I would love to hear about
it.


Re: How more than one spark job can write to same partition in the parquet file

Posted by ayan guha <gu...@gmail.com>.
No, we faced problems with that setup.

-- 
Best Regards,
Ayan Guha