Posted to user@spark.apache.org by Benjamin Kim <bb...@gmail.com> on 2016/09/14 01:08:28 UTC

Using Spark SQL to Create JDBC Tables

Has anyone created tables in Spark SQL that connect directly to a JDBC data source such as PostgreSQL? I would like to use the Spark SQL Thrift Server to access and query remote PostgreSQL tables. That way, we can centralize data access in Spark SQL tables alongside PostgreSQL, making it very convenient for users. They would not need to know or care where the data is physically located anymore.
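
To make the question concrete, this is roughly the kind of table
definition I have in mind, using Spark SQL's JDBC data source (just a
sketch, assuming Spark 2.0's CREATE TABLE ... USING syntax; the host,
database, table, and credentials below are made up):

    -- register a Spark SQL table backed directly by a remote
    -- PostgreSQL table over JDBC
    CREATE TABLE pg_customers
    USING org.apache.spark.sql.jdbc
    OPTIONS (
      url 'jdbc:postgresql://pg-host:5432/mydb',
      dbtable 'public.customers',
      user 'spark_user',
      password '********'
    );

Users would then query pg_customers like any other table from beeline or
any JDBC/ODBC client connected to the Thrift Server.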

By the way, our users only know SQL.

If anyone has a better suggestion, then please let me know too.

Thanks,
Ben
---------------------------------------------------------------------
To unsubscribe e-mail: user-unsubscribe@spark.apache.org


Re: Using Spark SQL to Create JDBC Tables

Posted by ayan guha <gu...@gmail.com>.
I did not install it myself, as it comes as part of Oracle's product.
However, you can bring in any SerDe yourself and add it to Hive's library.
See this
<http://blog.cloudera.com/blog/2012/12/how-to-use-a-serde-in-apache-hive/>
blog post for more information.
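
As a minimal sketch of that, from the Hive CLI or beeline (the jar path
and SerDe class name below are placeholders, not a real artifact):

    -- make the SerDe jar visible to the current Hive session
    ADD JAR /path/to/custom-serde.jar;

    -- table whose rows are read and written through the custom SerDe
    CREATE EXTERNAL TABLE events (
      id BIGINT,
      payload STRING
    )
    ROW FORMAT SERDE 'com.example.hive.serde.CustomSerDe'
    LOCATION '/data/events';

For use beyond a single session, the jar usually goes on Hive's auxiliary
library path (hive.aux.jars.path) instead of ADD JAR.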

On Wed, Sep 14, 2016 at 2:15 PM, Benjamin Kim <bb...@gmail.com> wrote:

> Thank you for the idea. I will look for a PostgreSQL SerDe for Hive. But,
> if you don’t mind me asking, how did you install the Oracle SerDe?
>
> Cheers,
> Ben
>
>
>
> On Sep 13, 2016, at 7:12 PM, ayan guha <gu...@gmail.com> wrote:
>
> One option is to have Hive as the central point for exposing data, i.e.
> create Hive tables which "point to" any other DB. I know Oracle provides
> their own SerDe for Hive; not sure about PG though.
>
> Once the tables are created in Hive, STS (the Spark Thrift Server) will
> automatically see them.
>
> On Wed, Sep 14, 2016 at 11:08 AM, Benjamin Kim <bb...@gmail.com> wrote:
>
>> Has anyone created tables in Spark SQL that connect directly to a JDBC
>> data source such as PostgreSQL? I would like to use the Spark SQL Thrift
>> Server to access and query remote PostgreSQL tables. That way, we can
>> centralize data access in Spark SQL tables alongside PostgreSQL, making
>> it very convenient for users. They would not need to know or care where
>> the data is physically located anymore.
>>
>> By the way, our users only know SQL.
>>
>> If anyone has a better suggestion, then please let me know too.
>>
>> Thanks,
>> Ben
>> ---------------------------------------------------------------------
>> To unsubscribe e-mail: user-unsubscribe@spark.apache.org
>>
>>
>
>
> --
> Best Regards,
> Ayan Guha
>
>
>


-- 
Best Regards,
Ayan Guha

Re: Using Spark SQL to Create JDBC Tables

Posted by Benjamin Kim <bb...@gmail.com>.
Thank you for the idea. I will look for a PostgreSQL SerDe for Hive. But, if you don’t mind me asking, how did you install the Oracle SerDe?

Cheers,
Ben


> On Sep 13, 2016, at 7:12 PM, ayan guha <gu...@gmail.com> wrote:
> 
> One option is to have Hive as the central point for exposing data, i.e. create Hive tables which "point to" any other DB. I know Oracle provides their own SerDe for Hive; not sure about PG though.
>
> Once the tables are created in Hive, STS (the Spark Thrift Server) will automatically see them.
> 
> On Wed, Sep 14, 2016 at 11:08 AM, Benjamin Kim <bbuild11@gmail.com> wrote:
> Has anyone created tables in Spark SQL that connect directly to a JDBC data source such as PostgreSQL? I would like to use the Spark SQL Thrift Server to access and query remote PostgreSQL tables. That way, we can centralize data access in Spark SQL tables alongside PostgreSQL, making it very convenient for users. They would not need to know or care where the data is physically located anymore.
> 
> By the way, our users only know SQL.
> 
> If anyone has a better suggestion, then please let me know too.
> 
> Thanks,
> Ben
> ---------------------------------------------------------------------
> To unsubscribe e-mail: user-unsubscribe@spark.apache.org
> 
> 
> 
> 
> -- 
> Best Regards,
> Ayan Guha


Re: Using Spark SQL to Create JDBC Tables

Posted by ayan guha <gu...@gmail.com>.
One option is to have Hive as the central point for exposing data, i.e.
create Hive tables which "point to" any other DB. I know Oracle provides
their own SerDe for Hive; not sure about PG though.

Once the tables are created in Hive, STS (the Spark Thrift Server) will
automatically see them.
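
As a sketch of what such a "pointer" table could look like, assuming a
JDBC storage handler is available to Hive (the class and property names
below follow Hive's JdbcStorageHandler; the connection details are made
up):

    CREATE EXTERNAL TABLE pg_customers (
      id   BIGINT,
      name STRING
    )
    STORED BY 'org.apache.hive.storage.jdbc.JdbcStorageHandler'
    TBLPROPERTIES (
      "hive.sql.database.type" = "POSTGRES",
      "hive.sql.jdbc.driver"   = "org.postgresql.Driver",
      "hive.sql.jdbc.url"      = "jdbc:postgresql://pg-host:5432/mydb",
      "hive.sql.dbcp.username" = "hive_user",
      "hive.sql.dbcp.password" = "********",
      "hive.sql.table"         = "customers"
    );

Hive then delegates reads of pg_customers to PostgreSQL over JDBC, and the
table shows up in the metastore like any other.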

On Wed, Sep 14, 2016 at 11:08 AM, Benjamin Kim <bb...@gmail.com> wrote:

> Has anyone created tables in Spark SQL that connect directly to a JDBC
> data source such as PostgreSQL? I would like to use the Spark SQL Thrift
> Server to access and query remote PostgreSQL tables. That way, we can
> centralize data access in Spark SQL tables alongside PostgreSQL, making
> it very convenient for users. They would not need to know or care where
> the data is physically located anymore.
>
> By the way, our users only know SQL.
>
> If anyone has a better suggestion, then please let me know too.
>
> Thanks,
> Ben
> ---------------------------------------------------------------------
> To unsubscribe e-mail: user-unsubscribe@spark.apache.org
>
>


-- 
Best Regards,
Ayan Guha