Posted to user@spark.apache.org by "Surendra , Manchikanti" <su...@gmail.com> on 2019/03/28 04:06:00 UTC

How to extract data in parallel from RDBMS tables

Hi All,

Is there any way to copy all the tables in parallel from an RDBMS using Spark?
We are looking for functionality similar to Sqoop.

Thanks,
Surendra

Re: How to extract data in parallel from RDBMS tables

Posted by Jason Nerothin <ja...@gmail.com>.
I can *imagine* writing some sort of DataFrameReader-generation tool, but
I am not aware of one that currently exists.
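The first step such a generation tool would need is enumerating the tables in the source database. A minimal sketch of that step, using Python's built-in sqlite3 as a stand-in for the RDBMS (a JDBC-based tool would instead call `DatabaseMetaData.getTables()` or query the information_schema; the function name here is my own):

```python
import sqlite3

def list_tables(conn):
    # sqlite_master is SQLite-specific; swap in information_schema.tables
    # (or JDBC DatabaseMetaData) for other databases.
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table'"
    ).fetchall()
    return [r[0] for r in rows]

if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER)")
    conn.execute("CREATE TABLE orders (id INTEGER)")
    print(sorted(list_tables(conn)))  # ['orders', 'users']
```

From that list, the tool could emit one `spark.read.jdbc(...)` call per table name.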

Thanks,
Jason

Re: How to extract data in parallel from RDBMS tables

Posted by "Surendra , Manchikanti" <su...@gmail.com>.
Looking for a generic solution, not for a specific DB or number of tables.

Re: How to extract data in parallel from RDBMS tables

Posted by Jason Nerothin <ja...@gmail.com>.
How many tables? What DB?

Thanks,
Jason

Re: How to extract data in parallel from RDBMS tables

Posted by "Surendra , Manchikanti" <su...@gmail.com>.
Hi Jason,

Thanks for your reply, but I am looking for a way to extract all the
tables in a database in parallel.
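One generic way to do this is to submit one read per table concurrently from the driver; Spark happily runs jobs submitted from multiple threads, and each per-table read can still be partitioned internally. A sketch of the orchestration only, with the Spark-specific read left as a callback (the names `extract_all` and `read_table` are my own, and the commented wiring is hypothetical):

```python
from concurrent.futures import ThreadPoolExecutor

def extract_all(tables, read_table, max_workers=4):
    """Run read_table(name) for every table concurrently and collect results.

    In a real Spark job, read_table might wrap something like:
        spark.read.jdbc(url, name, properties=props).write.parquet(out / name)
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {name: pool.submit(read_table, name) for name in tables}
        return {name: f.result() for name, f in futures.items()}

if __name__ == "__main__":
    # Stub reader in place of a Spark JDBC read, just to show the shape.
    print(extract_all(["users", "orders"], lambda n: f"extracted {n}"))
```

Thread-level concurrency across tables composes with numPartitions-level parallelism within each table's scan.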



Re: How to extract data in parallel from RDBMS tables

Posted by Jason Nerothin <ja...@gmail.com>.
Yes.

If you use the numPartitions option, your maximum parallelism for the read
will be that number. See also partitionColumn, lowerBound, and upperBound:

https://spark.apache.org/docs/latest/sql-data-sources-jdbc.html
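Concretely, those options split the [lowerBound, upperBound) range of partitionColumn into numPartitions WHERE clauses, one per task. A rough Python sketch of that splitting (simplified from what Spark actually generates; the function name and the integer-only handling are my own):

```python
def jdbc_partition_predicates(column, lower, upper, num_partitions):
    """Split [lower, upper) into per-partition WHERE clauses, roughly the
    way Spark's JDBC source does. Requires num_partitions >= 2 here; Spark
    itself also handles the single-partition (full scan) case."""
    assert num_partitions >= 2
    stride = (upper - lower) // num_partitions
    preds = []
    for i in range(num_partitions):
        lo = lower + i * stride
        hi = lo + stride
        if i == 0:
            # First partition also picks up NULLs in the partition column.
            preds.append(f"{column} < {hi} OR {column} IS NULL")
        elif i == num_partitions - 1:
            # Last partition is open-ended so no rows are dropped.
            preds.append(f"{column} >= {lo}")
        else:
            preds.append(f"{column} >= {lo} AND {column} < {hi}")
    return preds

# The equivalent real call in PySpark would look like:
#   spark.read.jdbc(url, table, column="id", lowerBound=0,
#                   upperBound=100, numPartitions=4, properties=props)
```

Note that lowerBound/upperBound only shape the partition boundaries; rows outside the range still land in the first or last partition.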

Thanks,
Jason